Abstract
Though schools do not track in Brazil, I find that black/white classroom segregation in Brazil is 
greater than recent estimates from North Carolina high schools (Clotfelter et al., 2020). How 
does race-based classroom segregation occur without tracking, and in a supposed “racial 
paradise,” no less? Using national, student-level data spanning from 2011 to 2017, I describe 
racial classroom segregation among Brazilian 5th and 9th graders and assess potential 
mechanisms identified in the literature. The findings are consistent with a segregation by chance 
regime in which (1) schools typically assign students to classrooms arbitrarily, producing initial 
assignments that are sometimes segregated by chance, and (2) schools choose to move forward
with the racially segregated “draws” rather than make race-conscious adjustments.

Introduction
Classroom segregation — how the grouping of students for whole-class instruction maps onto 
student characteristics — has long concerned education and inequality scholars who argue that it 
enables differential treatment within schools, particularly along racial and economic lines 
(Bowles & Gintis, 1976; Mickelson, 2001). To date, researchers have focused primarily on 
classroom segregation that occurs as a direct or downstream consequence of tracking, a practice 
in which students are segregated by perceived ability for differentiated instruction, typically 
involving explicit status markers denoting “high ability” versus “low ability” classrooms. This 
may entail assigning students to a suite of classrooms across many subjects or tracking may be 
differentiated across subjects to — at least ostensibly — allow a student to be assigned to high- 
track classrooms in some subjects and low-track classrooms in others (Lucas & Berends, 2002). 

US high schools are particularly known for classroom segregation by race due to the use 
of tracking and the charged debate surrounding it. A recent study by Clotfelter et al. (2020) 
measured racial and ethnic segregation within schools and between classrooms (i.e. classroom 
segregation) and segregation within counties and between schools (i.e. school segregation) for 
North Carolina’s 10th graders in 2017. They report the total white/black segregation, summing
classroom and school segregation, to have a Dissimilarity Index score of .52 in math, of which 
nearly 40% is due to classroom segregation (D = .20). 

Brazil prides itself on higher cross-race interaction and the absence of de jure segregation 
in its history, with political leaders often evoking a favorable comparison to US segregationism 
and racial conflict (Telles, 2004). Yet repeating the Clotfelter et al. analysis in Brazil’s public 
schools reveals that the total white/black segregation of Brazil’s 5th and 9th graders is roughly on
par with that of US 4th and 10th graders. Even more surprising is that classroom segregation in
both 5th (D = .29) and 9th (D = .25) grade in Brazil is greater than in US high schools (D = .20),
despite Brazil not using classroom-level tracking. This highlights the possibility that non- 
tracking school systems are not exempt from becoming highly classroom segregated. 

How does race-based classroom segregation occur without tracking, and in a supposed 
“racial paradise,” no less? I contend that this phenomenon is rooted in (1) the ideological and 
historical differences between the US and Brazil that cause racial segregation to face different 
barriers to legitimacy in each, and (2) the potential potency of chance as a segregating force 
when a society is in denial about race’s social reality. 

The analysis proceeds by describing the extent of racial classroom segregation in Brazil; 
comparing the observed data to simulated datasets in which students are assigned to classrooms 
by random assignment, age sorting, or achievement sorting; and estimating associations between 
classroom segregation and indicators of classroom sorting mechanisms. The findings are 
consistent with segregation that occurs due to arbitrary assignment rather than the age sorting, 
achievement sorting, teacher steering, and parent lobbying mechanisms that have been identified 
in the literature. Racial segregation by chance is congruent with the hypothesis that racial 
classroom segregation without tracking is made possible in Brazil due to antiracialism and 
racism denial rooted in the myth of “racial democracy.” 

Classroom Segregation without Tracking? “It’s Unimaginable.” 
The absence of tracking in Brazil appears to promote the assumption that there is no classroom 
segregation. When I interviewed a former state secretary of education in 2017, he explained to 
me that students are not segregated within Brazilian schools. He recounted a story about 
prejudice causing between-school racial and economic segregation and then continued, “But
[segregation] in between [classrooms]? One school — difference between classes, classrooms,
and so on — it’s almost — it’s unimaginable at the moment for me” (June 7, 2017). Another state
secretary of education I interviewed noted that her state has no classroom assignment guidelines, 
yet was adamant that classrooms are not segregated by race in her state. When asked if she had 
heard of classroom segregation elsewhere in Brazil, she quipped, “Aqui nos Estados Unidos” 
(“Here in the United States”) (June 7, 2017). She later explained, unprompted, that there is no 
tracking in Brazil. These interviews comport with dozens of informal interactions I had with 
state and municipal education administrators while triangulating my findings. The common 
belief appears to be that Brazil does not track, therefore there is no classroom segregation. 

Tracking is ever-present in the international literature on classroom-level segregation. 
Yet Gamoran’s (2010) international review lists only six countries that track within schools. 
Many nations sort between schools rather than within them (Hanushek & Woessmann, 2006) and 
tracking countries like the US only track in some schools and at some grade levels. However, 
tracking is a crucial feature of US educational discourse, having come into fashion as a response 
to the racial integration of schools (Mickelson, 2001) and remained the topic of a bitter debate 
that some call the “tracking wars” (e.g., Loveless, 2011). That discourse has so dominated the 
classroom segregation literature that tracking is now the primary framework available for 
understanding classroom segregation. It is unclear whether classroom segregation does not occur 
without tracking, as my interviewees seem to have concluded, or if classroom segregation only 
appears to be an epiphenomenon of tracking because of narrow case selection in the literature. 

One non-tracking context that has received attention is US elementary schools. Though 
few classroom segregation analyses include US elementary schools, those that do consistently 
find low racial segregation (Clotfelter et al., 2003, 2008, 2020; Conger, 2005; Kalogrides &
Loeb, 2013; Morgan & McPartland, 1981). In fact, two of these studies offer evidence that at
least some US elementary schools proactively balance their classrooms on racial lines. As I
discuss below, random classroom assignment can produce meaningful racial segregation. 
Clotfelter et al. (2003, 2008) find that classrooms in North Carolina’s elementary schools are 
often less racially segregated than would have occurred under random assignment, indicating 
that there may be intentional balancing efforts. This is strikingly exceptional given the 
persistence of racial segregation throughout US society, and supports the conclusion that 
widespread classroom segregation does not occur in non-tracking contexts. 
Pseudo-Tracking 
One possibility is that Brazilian schools are only nominally non-tracking. What little is known 
about racial classroom segregation in Brazil comes from a small literature focused on the 
possibility of pseudo-tracking (academically sorting students into classrooms without formally 
differentiated instruction) by test scores or age/grade distortion. Soares (2005) reports that 32% 
of the total achievement variation in Minas Gerais occurs at the classroom level, which is three 
times the amount at the school level. In a national study of 5th graders in 2009, de Oliveira et al.
(2013) identify 10% of schools in which at least 33.4% of the variation within the school is 
between classrooms. In a study reported by Instituto Unibanco (2017), Mariana Leite identifies 
426 elementary schools across the country with substantial classroom segregation by test scores 
and reports that higher-performing classrooms are assigned more experienced teachers than 
lower-performing classrooms in the same school and grade. While only about five percent of 5th 
grade students and four percent of 9th grade students in my sample have principals who report 
assigning students to classrooms based on achievement, more may do so informally (Table 1). 
Other scholars consider sorting by age/grade distortion (the discrepancy between a
student’s age and that expected at his/her grade level). Bartholo and de Costa (2014) find
evidence of age sorting in Rio de Janeiro’s public school system, although it is not within
schools as they are defined in the present study. In Brazil, students are often divided into separate 
shifts that attend classes in the same institution at different times of day. In the present study, I 
define a school as an institution-specific shift, as this is the population among which classroom 
assignments are made. Bartholo and de Costa (2014) find substantial shift segregation — 
segregation between schools within school administrations — by race and class that results from 
selecting students into shifts according to age/grade distortions. An earlier study by de Costa and 
Koslinski (2006) suggests this process also occurs at the classroom level; they found Rio de 
Janeiro schools dividing their classrooms by age and making exceptions for high-income and 
high-achieving students. Principals frequently indicate that they age sort classrooms; about 35% 
of 5th graders and 37% of 9th graders in my sample have principals who report age sorting 
(Table 1). Altogether, these studies indicate that Brazilian schools may be sorting students on 
academic criteria as a pseudo-tracking assignment practice. However, it remains unclear whether 
either practice promotes substantial racial segregation at a national scale. 

Teacher Steering and Parent Lobbying 

Another possibility is that secondary mechanisms of segregation under tracking promote 
segregation in non-tracking contexts. Tracking is approached as both a primary mechanism of 
classroom segregation and a context that promotes secondary, segregation-exacerbating 
mechanisms. The latter are the focus of a subarea of the tracking literature that considers whether 
and why schools are more racially and economically segregated than academic differences 
predict. Though some studies do not find exacerbated segregation (Garet & DeLany, 1988; 
Haller, 1985; Haller & Davis, 1981), a substantial scholarship does. These scholars explain this
“knock-on” segregation with consideration of how status influences a dynamic classroom
assignment process, showing that classroom segregation is influenced by biased assessments of
ability, parent lobbying for classroom assignments, teacher steering during the assignment 
process, and schools competing for the enrollment of advantaged students (Delany, 1991; 
Grissom et al., 2015; Lewis & Diamond, 2015; Oakes & Guiton, 1995; Watanabe, 2008). 
Altogether, this scholarship argues that, as Oakes and Guiton (1995) put it, “irregularities favor 
the advantaged” (p.26) when it comes to classroom assignment. 

Of these secondary segregation mechanisms, teacher steering and parent lobbying are 
most likely to occur in non-tracking schools. Grissom et al. (2015) describe the micropolitics of 
classroom assignment in which teachers compete for particular students, resulting in lower-status 
students tending to be in classrooms with newer and less effective teachers. Additionally, parent 
lobbying can also increase segregation, whether because racially privileged parents are more 
likely to lobby for classrooms (Delany, 1991; Oakes & Guiton, 1995) or because they lobby 
more successfully due to deference from school administrators (Lewis & Diamond, 2015). 

Segregation by Chance 
Another possible mechanism of classroom segregation in non-tracking contexts is segregation by 
chance. It has long been understood in the segregation measurement literature that segregation 
occurs under random assignment (Cortese et al., 1976). This segregation by chance (also called 
small-unit bias, index bias, random segregation, expected segregation, and random unevenness) 
can be substantial when assignment is highly stochastic and groups (i.e., racial groups) or units 
(i.e., classrooms) are small. This is akin to the problem of random sampling with a small N in 
which it is likely that important characteristics (e.g., race) will be unbalanced across treatment 
conditions (e.g., classroom) because the assignment variable, despite being random and
uncorrelated with race on average, happens to be correlated with race in a given iteration. On
average, there is some imbalance, and this expected value of segregation under random
assignment is a function of classroom and racial group sizes (Cortese et al., 1976). 
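
The dependence of chance segregation on unit size can be illustrated with a short simulation. The following is a minimal sketch, not the analysis code used in this study: it assumes a hypothetical two-group school split into two equal-sized classrooms, and it uses the two-group Dissimilarity Index (D) for simplicity. The function names and school sizes are illustrative assumptions.

```python
import random

def dissimilarity(classrooms):
    """Two-group Dissimilarity Index D across classrooms.

    `classrooms` is a list of lists of group labels ("a" or "b").
    """
    total_a = sum(room.count("a") for room in classrooms)
    total_b = sum(room.count("b") for room in classrooms)
    return 0.5 * sum(
        abs(room.count("a") / total_a - room.count("b") / total_b)
        for room in classrooms
    )

def random_assignment(students, n_rooms, rng):
    """Shuffle students and deal them into (near) equal-sized classrooms."""
    shuffled = students[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_rooms] for i in range(n_rooms)]

def expected_d(n_a, n_b, n_rooms, draws=1000, seed=0):
    """Monte Carlo estimate of the mean D under random assignment."""
    rng = random.Random(seed)
    students = ["a"] * n_a + ["b"] * n_b
    total = sum(
        dissimilarity(random_assignment(students, n_rooms, rng))
        for _ in range(draws)
    )
    return total / draws

# Hypothetical schools: same racial composition, different classroom sizes.
small = expected_d(n_a=30, n_b=30, n_rooms=2)    # two classrooms of 30
large = expected_d(n_a=300, n_b=300, n_rooms=2)  # two classrooms of 300
print(f"classrooms of 30: D = {small:.3f}; classrooms of 300: D = {large:.3f}")
```

Under these assumptions the expected D is non-zero in both schools, and it is larger when classrooms are small, which is the pattern the random-baseline literature describes.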

Thus, when schools group students into classrooms according to criteria that are 
uncorrelated with race, they can produce substantial segregation because classrooms are small 
samples of the school-grade population. While I spoke to one former principal who described 
using random number generators, in practice schools may approach assignment haphazardly or 
use arbitrary — rather than random — criteria like the alphabetical order of names. 

How Much Segregation Occurs by Chance? 

Random baselines are commonly used throughout the sciences as either bias corrections or non- 
zero null hypotheses when the expected value of a measure under random assignment is non- 
zero. The literature on segregation between units tends to differentiate segregation that must have 
been socially produced from that which could be due to chance (i.e., segregation net of the 
random baseline) through bias-correction or statistical testing (F. D. Blau, 1977; Bygren, 2013; 
Carrington & Troske, 1997; Cortese et al., 1976; Fossett, 2017; Winship, 1977). A similar 
scholarship on segregation in networks differentiates between a baseline model of homophily 
under random assortment and homophily which occurs net of the baseline (P. M. Blau, 1977; 
Fararo & Skvoretz, 1987; McPherson et al., 2001). 

The expected value of segregation under random assignment is often substantial when 
units are small (e.g., Bygren, 2013; Carrington & Troske, 1997). This is true in the present case; 
random assignment would produce as much racial classroom segregation in Brazil as would 
pseudo-tracking sorting practices. Figure 1 shows the distribution of racial classroom segregation 
in Brazilian public schools in four simulated assignment processes: random assignment, age
sorting, strict sorting by test scores as though they are directly observed, and sorting based on a
noisy proxy of test scores (r = .75). The distribution of racial segregation is similar in each
condition, with random assignment producing only slightly less segregation than age and
achievement sorting. In the average school, the mean racial segregation after 50 random
assignment draws is 70% of the observed 5th grade average and 86% of the observed 9th grade
average (Table 1). Segregation by chance is potentially a potent source of classroom segregation. 
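
One way to picture sorting on a noisy proxy of test scores is sketched below. This is an illustrative assumption about how such a proxy could be built, not a description of the procedure used in this study: the proxy mixes the standardized score with independent Gaussian noise so that its correlation with the true score is approximately the target r, and students are then ranked on the proxy and cut into equal-sized classrooms. All names and the score distribution are hypothetical.

```python
import math
import random

def standardize(values):
    """Return z-scores (mean 0, standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def noisy_proxy(scores, r, rng):
    """Proxy with correlation of roughly r with the true scores."""
    noise_weight = math.sqrt(1 - r * r)
    return [r * z + noise_weight * rng.gauss(0, 1) for z in standardize(scores)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

def sort_into_classrooms(values, n_rooms):
    """Rank students on `values` and cut the ranking into equal-sized rooms.

    Returns the classroom index (0 = lowest values) for each student.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    size = math.ceil(len(values) / n_rooms)
    rooms = [0] * len(values)
    for rank, i in enumerate(order):
        rooms[i] = rank // size
    return rooms

rng = random.Random(42)
scores = [rng.gauss(250, 50) for _ in range(20000)]  # hypothetical test scores
proxy = noisy_proxy(scores, r=0.75, rng=rng)
print(f"correlation between score and proxy: {pearson(scores, proxy):.2f}")

# Sort one hypothetical 60-student grade into two classrooms on the proxy.
rooms = sort_into_classrooms(proxy[:60], n_rooms=2)
```

Strict achievement sorting corresponds to calling `sort_into_classrooms` on the scores themselves rather than on the proxy.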

However, this analysis — like the random baselines used in prior studies — does not tell us 
whether substantial classroom segregation occurs by chance. The literature consistently 
considers segregation net of random baselines to enable researchers to focus on the remaining 
segregation, positioning segregation by chance as both asocial and inevitable (otherwise 
removing the random baseline overcorrects in cases with less stochastic assignment). This 
approach to segregation by chance is useful for certain questions, but leaves gaps in our 
understanding; I was unable to find any studies that investigate whether arbitrary assignment 
does — not just may — produce substantial segregation in schools or otherwise. 

This study departs from tradition and conceptualizes classroom segregation by chance as 
a social outcome that is impacted by schools’ decisions just as segregation from tracking is. 
Consider a school deciding whether to use race-stratified random classroom assignment 
(minimizing racial segregation) or to use simple random assignment. In the former case, racial 
segregation is predetermined and kept low. In the latter, it is an oft-segregating random draw 
from a set of possibilities based on the school’s racial composition and classroom sizes. Even 
when random assignment is used, schools can choose to have less segregation than would occur 
by chance; when they “draw” highly racially segregated assignments prior to starting the school
year, they can rearrange students to provide a more balanced set of assignments or simply try
another draw. When schools choose not to integrate classrooms in these ways, segregation by
chance is itself a practice, and it must be analyzed as one to understand classroom segregation.
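
The choices described above can be made concrete with a small sketch. It is illustrative only and rests on assumptions not in the source: a hypothetical 60-student grade, two classrooms, a simple imbalance measure (the largest difference in any group's head count between classrooms), and hypothetical function names. It contrasts simple random assignment, race-stratified random assignment, and redrawing until a draw is acceptably balanced.

```python
import random
from collections import Counter

def deal(ordered, n_rooms):
    """Deal an ordered list of students into classrooms round-robin."""
    return [ordered[i::n_rooms] for i in range(n_rooms)]

def simple_random(students, n_rooms, rng):
    """Simple random assignment: shuffle everyone, then deal."""
    shuffled = list(students)
    rng.shuffle(shuffled)
    return deal(shuffled, n_rooms)

def stratified_random(students, n_rooms, rng):
    """Race-stratified assignment: shuffle within each group, then deal."""
    by_group = {}
    for student in students:
        by_group.setdefault(student["race"], []).append(student)
    ordered = []
    for group in sorted(by_group):
        members = by_group[group]
        rng.shuffle(members)
        ordered.extend(members)
    return deal(ordered, n_rooms)

def max_gap(rooms):
    """Largest between-classroom difference in any group's head count."""
    counts = [Counter(s["race"] for s in room) for room in rooms]
    groups = set().union(*counts)
    return max(
        max(c[g] for c in counts) - min(c[g] for c in counts) for g in groups
    )

def redraw_until_balanced(students, n_rooms, rng, tolerance, max_tries=1000):
    """Keep drawing simple random assignments until one is within tolerance."""
    best = simple_random(students, n_rooms, rng)
    for _ in range(max_tries - 1):
        if max_gap(best) <= tolerance:
            break
        candidate = simple_random(students, n_rooms, rng)
        if max_gap(candidate) < max_gap(best):
            best = candidate
    return best

# Hypothetical grade: 24 white, 24 parda, and 12 preta students.
labels = ["white"] * 24 + ["parda"] * 24 + ["preta"] * 12
school = [{"id": i, "race": race} for i, race in enumerate(labels)]

rng = random.Random(1)
print("simple draw gap:    ", max_gap(simple_random(school, 2, rng)))
print("stratified draw gap:", max_gap(stratified_random(school, 2, rng)))
print("redrawn gap:        ", max_gap(redraw_until_balanced(school, 2, rng, 2)))
```

In this sketch the stratified procedure keeps every group's classroom counts within one student of each other by construction, whereas the simple draw leaves the level of imbalance to chance unless the school inspects the draw and redraws.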

This perspective is also useful for practitioners and policy makers. I have shown that 
similarly high levels of racial segregation would occur under random assignment as under age 
and achievement sorting. Those looking to reduce racial segregation in Brazil’s schools will be 
better equipped knowing not just how much more segregated classrooms are than they would be 
under random assignment, but also which assignment process is more commonly the culprit. 

Legitimacy and Segregation in the US and Brazil 
I turn now to considering how the US and Brazilian contexts may shape how classroom 
segregation occurs. I follow Weber’s (1978) descriptive account of legitimacy as the condition of 
being “approximately or on the average, oriented toward determinable ‘maxims’” such that a 
legitimate condition is understood to be accordant with broadly accepted norms and values, 
inducing an obligation to at least tolerate it (31). I define a logic as a narrative, drawn from 
extant cultural norms and myths, that renders a practice recognizable. A “legitimating logic” 
renders the practice recognizable as a right and proper way of doing things. 
In the United States 
A hallmark of the 20th century US is the expansion and subsequent partial disbanding of a
nationwide tapestry of policies promoting and enforcing de jure racial segregation. Starting with
Brown v. Board of Education of Topeka, Kansas, in 1954, school integration was a crucial
site in the decades-long delegitimization of segregationism, and explicit racism more broadly, in 
the US. Due to school segregation’s special place in the nation’s relationship to racism, 
segregationism is a ready explanation for racial segregation along institutional boundaries in
education. This makes legitimating logics crucial to sustaining racial segregation in schools; that
is, broad tolerance of segregation is conditional upon participants and onlookers recognizing it as
occurring due to practices consistent with cultural narratives of acceptable segregation. 

This is hardly a substantial barrier to segregation along most institutional boundaries in 
education because placements in most institutional units are either commodified or ostensibly 
subject to student/parent agency, fitting dominant narratives in which segregation results from 
markets, cultural clash, and free choices. The residential segregation that produces substantial 
segregation across districts or neighborhood schools is construed as the result of “natural 
antagonism between ‘cultures’” (Nash, 2003) and fair, market forces rather than an intended 
consequence of government policies (Rothstein, 2017). Segregated friendship networks and 
cafeteria seating are chalked up to natural cultural differences expressed through student choices, 
ignoring institutional roles (Thomas, 2005). The primacy of individual choices renders most 
racial segregation in education as either an acceptable, if undesirable, consequence of respecting 
fundamental rights or a self-evidently optimal organization of collective preferences. 

Classroom segregation is particularly resistant to market, cultural clash, and free choice 
logics because classroom assignments are explicitly determined by schools, even if student and 
parent input is sought. Tracking provides a legitimating logic for the racial segregation it 
produces by framing segregation as an unfortunate byproduct of meritocracy, and this may 
explain why it is the dominant source of racial classroom segregation in the US. 

In parallel, concomitant with the delegitimization of segregationism and overt racism was 
the transformation from widespread explicit racism to a racism-denying ideology that positions 
undoing harm as unnecessary intervention (Bobo et al., 1997; Bonilla-Silva, 2006). This has 
constrained efforts at school integration; for example, in Charlotte-Mecklenburg, North Carolina,
one of the nation’s most successful court-ordered desegregation programs was ordered stopped,
against the school district’s wishes, on the basis that “achieving diversity [was] not a proper
grounds for race-conscious action” (Capacchione v. Charlotte-Mecklenburg Schools, 1999, p.
291). This ruling is indicative of a contested space in which race-based educational integration is
often pursued as self-evidently legitimate (i.e., the purpose is to “achiev[e] diversity”) and this
legitimacy is challenged by “reverse discrimination” activists arguing that the US is a post-racial 
society and framing the consideration of race to redress racism as the real racism. Classroom 
integration efforts have not caught the attention of “reverse discrimination” activists, presumably 
because they are not prominent, which may explain why in some cases classrooms tend to be less 
segregated than would be expected by chance (Clotfelter et al., 2003, 2008). 

In Brazil 

When Brazil entered the 20th century, slavery had only recently been abolished, in 1888.
Compared to the US, Brazil had a far greater population with both European and non-European 
ancestry, owing to the male-dominant demographics of Portuguese colonizers who more often 
had children (with, at best, dubious consent) with non-whites than the US colonizers who 
primarily migrated as families (Telles, 2004). Brazil was also in the midst of branqueamento, a 
national eugenics policy promoting European migration and cross-racial marriage as a grand 
project to design a white nation through the dilution of black blood (Loveman, 2009). 

By mid-century, the government was actively promoting the ideology of racial 
democracy, a patriotic, racism-denying ideology that reframes Brazil as a “racial paradise” with 
a single, mixed Brazilian race and presents multiraciality as a consequence of racial harmony 
(Bailey, 2009; Freyre, 1946; Telles, 2004). The 1964-1985 military dictatorship embraced the 
myth of racial democracy and brutally crushed dissidents, hampering racial justice movements. 


Today, racial democracy lives on; in response to the murder of João Alberto Silveira Freitas,
Vice President Mourão declared “there is no racism” in Brazil (Camazano, 2020). However, this
ideology is increasingly contested by the growing Black Movement, which promotes positive 
black identity among Afro-Brazilians and challenges racism and inequality (Bailey, 2009; Telles, 
2004). Some now consider racial democracy an aspiration: the promise of a raceless society 
(Bailey, 2009). 

Importantly, racial democracy grew in explicit recognition that Brazil did not implement 
de jure segregation and anti-miscegenation like the US, and frames Brazil as non-segregationist 
(Bailey, 2009; Telles, 2004). Consequently, de facto racial segregation is commonly assumed to 
be epiphenomenal, typically to class. This is the case with respect to housing, though racial 
residential segregation net of class remains sizable (Telles, 2004). This myth of a race-neutral 
and racially harmonious Brazil lends legitimacy to de facto racial segregation otherwise not 
readily explained. One might think of this as a legitimating logic-in-waiting, a pre-existing 
narrative that for many renders racial segregation tolerable regardless of its character. 

Meanwhile, race-based integration may face greater barriers to legitimacy than does 
racial segregation. Another important component of racial democracy, antiracialism, construes 
the discussion of race and racism as a racist, foreign intervention, making it improper to make 
racial ascriptions explicit (Guimaraes, 2001; Schwartzman, 2009). Ascriptions to darker racial 
groups are particularly improper; when ascribing someone in your presence who you see as 
black, it is polite to instead use a lighter category like moreno (Schwartzman, 2009). Brazilians 
see one another as raced, reliably categorizing photographs into racial groups (Bailey, 2009); this 
system of manners upholds the pretense of a single Brazilian race even as it implies the 
superiority of whiteness. Thus, racial democracy is a colorblind ideology that goes beyond US
colorblind or laissez-faire racism (Bobo et al., 1997; Bonilla-Silva, 2006); it denies the existence
of race not only as an axis of oppression but as a socially meaningful category. This works
against race-based classroom integration by calling into question the appropriateness of school 
administrators acknowledging color differences among students and explicitly considering those 
differences when organizing classrooms. 

However, race-based integration is not without its proponents. Most notably, public 
colleges began adopting racial affirmative action policies in 2001, a major win for the Black 
Movement. Telles and Paixao (2013) note that by 2010, “class quotas ha[d] become more 
common than race quotas, even though the debate ha[d] been almost entirely about race quotas” 
(p. 10). They argue that the strong opposition to race quotas specifically reflects denial of 
racism’s role in creating racial inequality in higher education. The logic of equalizing 
opportunity failed to legitimate race-based college integration despite awareness of stark racial 
inequities in college-going. Thus, while there are likely teachers, principals, and other school 
administrators who support proactive racial integration of classrooms as they do of universities, 
this position presumably faces an even tougher battle because classroom segregation has not 
been established as a social problem that would legitimate race-based classroom integration. 

Given their different ideological contexts, the US and Brazil are likely to have different
mechanisms of classroom segregation. Whereas racial segregation in US schools is liable to raise
suspicion unless it adheres to a legitimating logic like tracking, unexplained segregation in Brazil 
is likely to be given the benefit of the doubt. While there is some evidence of non-tracking US 
schools integrating their classrooms, race-based integration in Brazil has questionable legitimacy 
owing to antiracialism. These factors make Brazil particularly susceptible to classroom 
segregation by chance, which can only be a substantial driver of racial segregation if unexplained
and unintended racial segregation is accepted by school administrators. Otherwise, even a school
using random assignment could keep segregation by chance low by monitoring drafted
classroom assignments for substantial racial imbalance and reassigning some students. 

Data 

I investigate classroom segregation in Brazil using Prova Brasil 2011-2017, a publicly-available 
dataset based on a biennial, nationwide student achievement test that includes a student survey 
with self-reported demographic information as well as identifiers linking students to their 
classrooms (which are stable across subjects), shifts, and school administrations (Instituto 
Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira, 2017). I use these identifiers to
link Prova Brasil to Censo Escolar 2011-2017, a biennial national survey of teachers and
principals (Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira, 2017).
Collected at the end of the school year, this survey aims to include all Brazilian public-school
5th- and 9th-graders except those attending very small schools.

I focus on public schools in which classroom segregation is possible, restricting the data 
to multi-classroom schools where a school is defined as the set of students eligible for 
assignment to the same set of classrooms (e.g., each shift within a school administration is a 
school). I also include schools only if all of their classrooms have race item response rates of at 
least 75%. The full sample includes 53,452 school-year observations in 5th grade and 32,068 in 
9th grade. (See Table 1 for more detail.) Overall, the samples include over 5.3 million students.
Though they are not representative of all Brazilian 5th and 9th grade students, these samples 
cover a broad swath of the country and include thousands of distinct school systems. This 
breadth ensures that the present study identifies general patterns rather than local idiosyncrasies. 


Measures 
Racial Segregation
Tracking analyses often consider how classroom segregation becomes curriculum-wide 
segregation. Here, I focus on the production of classroom segregation itself, as students in 
Brazil’s public schools are grouped into classrooms that remain together for each subject. 

Unless otherwise stated, I measure racial segregation across classrooms using the 
Information Theory Index. This enables measuring segregation among more than two racial 
groups and decomposing segregation without bias (Reardon et al., 2000; Reardon & Firebaugh, 
2002). The Information Theory Index, denoted H, operationalizes segregation as the degree to 
which students are unevenly distributed across classrooms given a school’s population. Unless 
otherwise stated, the segregation measures reported here are multigroup segregation measures 
which simultaneously consider the segregation of all racial groups. H is based on entropy (E), a
heterogeneity measure:

\[ E = \sum_{m=1}^{M} p_m \ln\left(\frac{1}{p_m}\right) \tag{1} \]

where p_m is the proportion in group m (e.g., proportion white) and M is the number of racial
groups. H compares the heterogeneity of
classrooms to that of their school, weighting the contribution of each group and classroom 


according to relative size: 


(2) 


where n; is the number of students in classroom j, N is the number of students in the school, p jm 
is the proportion of students in classroom j who are in group m, and E is the entropy of the 
school. H = 0 when every classroom is proportional to the school, and H = 1 when classrooms 


are completely segregated, meaning no racial group shares a classroom with any other. 
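
As a minimal sketch of how H can be computed for a single school, the Python below takes a list of classrooms, each a list of student race labels, and evaluates the size-weighted sum over classrooms and groups directly. The function names, the label strings, and the convention of returning 0 for a school with only one group are illustrative assumptions, not the code used in this study.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy E = sum_m p_m * ln(1 / p_m) over the groups present in `labels`."""
    n = len(labels)
    return sum((c / n) * math.log(n / c) for c in Counter(labels).values())


def information_theory_index(classrooms):
    """Information Theory Index H for one school.

    `classrooms` is a list of classrooms, each a list of race labels.
    """
    school = [label for room in classrooms for label in room]
    N = len(school)
    E = entropy(school)
    if E == 0:
        # Assumption for illustration: a single-group school has no segregation.
        return 0.0
    p = {m: c / N for m, c in Counter(school).items()}
    H = 0.0
    for room in classrooms:
        n_j = len(room)
        for m, c in Counter(room).items():
            p_jm = c / n_j  # groups absent from the room contribute nothing
            H += (n_j / (N * E)) * p_jm * math.log(p_jm / p[m])
    return H
```

With two classrooms that each hold only one group, the function returns 1 (up to floating-point error); with two classrooms that each mirror the school's composition, it returns 0.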
Measuring racial segregation requires measuring race, an inherently fraught task. So as to
stray as little as possible from students’ emic racial categories and capture the experiences of as 
many students as I can, I do not combine or drop categories. Instead, I measure segregation 
among all six racial categories offered in the Prova Brasil survey: white, parda/o (roughly, 
brown), preta/o (roughly, black), indigenous, amarela/o (yellow, similar to Asian), and “I don’t 
know.” It is not obvious that this is the ideal approach nor what alternatives would be preferable, 
so I err toward operationalizing race in a more emic and data-retentive way. 

Simulating Classroom Assignments 

I simulate classroom assignment under four conditions: random assignment, age sorting, strict 
achievement sorting, and noisy achievement sorting. Each simulation assigns the students in the 
observed data to hypothetical, equal-sized classrooms in their school-grade-year to model what 
would occur under a particular assignment regime. I estimate a baseline level of segregation for 
each school-grade-year so as to capture the segregation expected under each assignment 
condition. Random assignment and noisy achievement sorting include random variation. In these 
cases, I simulate 50 assignments in each school and take the mean to estimate the baseline. 

I use random assignment to proxy for the arbitrary segregation condition that would 
produce substantial segregation by chance. I model it by randomly assigning each student to a 
classroom in their school with an equal probability of being assigned to each classroom. 
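
The random baseline can be sketched as follows, assuming that students are shuffled and then cut into classrooms whose sizes differ by at most one, and that H is averaged over 50 draws. The helper names (`h_index`, `split_equal`, `random_baseline`), the seed handling, and the use of the entropy-ratio form of H are assumptions made for illustration rather than the study's implementation.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Entropy of the group composition of `labels` (0 for an empty list)."""
    n = len(labels)
    return sum((c / n) * math.log(n / c) for c in Counter(labels).values())


def h_index(classrooms):
    """Information Theory Index via the equivalent form 1 - sum_j n_j E_j / (N E)."""
    school = [x for room in classrooms for x in room]
    e_school = entropy(school)
    if e_school == 0:  # single-group school: treated as unsegregated (assumption)
        return 0.0
    within = sum(len(room) * entropy(room) for room in classrooms)
    return 1 - within / (len(school) * e_school)


def split_equal(students, n_rooms):
    """Cut an ordered list into n_rooms blocks whose sizes differ by at most one."""
    base, extra = divmod(len(students), n_rooms)
    rooms, start = [], 0
    for j in range(n_rooms):
        size = base + (1 if j < extra else 0)
        rooms.append(students[start:start + size])
        start += size
    return rooms


def random_baseline(labels, n_rooms, draws=50, seed=0):
    """Mean H over `draws` random assignments of one school-grade-year."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        total += h_index(split_equal(shuffled, n_rooms))
    return total / draws
```

Because classrooms are small, most draws are not perfectly proportional, so the mean of H across draws is typically above zero even though no sorting rule is applied.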

Age sorting is a proxy for the process of sorting students based on age/grade distortion. 
Following the Prova Brasil’s wording, I operationalize age in 5th grade as age on the day of the
survey and age in 9th grade as age at the end of the year. I rank students by age and sort them into
equal-sized classrooms by rank. I randomly assign students whose ages are not observed.

I use both strict and noisy achievement sorting to proxy for the process of assigning
students to classrooms based on achievement or perceived ability. I operationalize achievement
as the average of Prova Brasil Portuguese and math scores. For strict sorting, I rank students on
achievement and sort them into equal-sized classrooms by rank. One shortcoming is that scores
are taken at the end of the school year. Further, schools may sort by perceived ability rather than
achievement. Noisy achievement sorting models aim to address this. In these models, I add
classical error to achievement such that the “noisy achievement” has a reliability of .75 as a
measure of achievement. I then rank students on this measure and sort them into equal-sized
classrooms by rank. I randomly assign students when achievement is not observed.
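
A sketch of rank-based sorting with classical error is given below. It assumes that reliability is the ratio of true-score variance to noisy-score variance, so the error variance is var(achievement) × (1 − reliability) / reliability (one third of the achievement variance at a reliability of .75). Whether the variance is taken within schools or over the full sample is not stated in the text; the sketch uses whatever list of scores it is given. The function names are hypothetical, and missing values are not handled here.

```python
import random
import statistics


def add_classical_error(scores, reliability=0.75, seed=0):
    """Add independent normal error so var(true) / var(noisy) is about `reliability`."""
    rng = random.Random(seed)
    # Error SD implied by the target reliability (assumption: classical test theory).
    error_sd = (statistics.pvariance(scores) * (1 - reliability) / reliability) ** 0.5
    return [s + rng.gauss(0.0, error_sd) for s in scores]


def rooms_by_rank(values, n_rooms):
    """Return a classroom index for each student after ranking on `values`.

    Students are ranked from lowest to highest and cut into n_rooms blocks
    whose sizes differ by at most one.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    rooms = [0] * n
    for rank, i in enumerate(order):
        rooms[i] = rank * n_rooms // n
    return rooms
```

Strict sorting corresponds to calling `rooms_by_rank` on the observed scores; noisy sorting corresponds to calling it on the output of `add_classical_error`, repeated over several seeds.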

Segregation Predictors 

School characteristics indicative of different classroom segregation processes include classroom 
segregation by age, Portuguese achievement, math achievement, and SES; stratification across 
racial groups by age, Portuguese achievement, math achievement, and SES; racial disparities in 
teachers’ experience, tenure status, and salary; and principal-reported sorting on age and 
achievement. For example, if age sorting is driving racial segregation, racial segregation should 
be positively associated with age segregation, racial stratification by age, and principal-reported 
age sorting. Racial segregation may also be shaped by tendencies of school administrations or 
particular places, so I also consider segregation levels in other shifts under the same school 
administration, segregation of the same school in adjacent years, and municipality, state, and 
region random-intercepts. (See Appendix A.) Some variables necessitate choices about how to 
measure differences across races. I report racial stratification findings using stratification among 
all groups because supplementary analyses show that findings do not differ for stratification of
specific groups. I report racial disparities as white-nonwhite disparities because supplementary
analyses show that the findings do not differ when focusing on other groups (e.g., pardo-nonpardo).
These supplementary analyses are available upon request.

Methods 
The analysis occurs in three stages: describing the extent of classroom segregation; comparing 
how the observed data fit random assignment to how they fit other simulated classroom 
assignments; and comparing the association between classroom segregation and the random 
baseline to associations with indicators of other classroom sorting mechanisms. 
Describing Classroom Segregation 
To describe the extent of segregation, I compare Brazil to North Carolina, replicating the 
procedure Clotfelter et al. (2020) use to describe racial segregation in the US state. I follow 
Clotfelter et al. by estimating black/white (or preto/white) segregation as a population-weighted 
average of the Dissimilarity Index in places (counties or municipalities) that are at least four 
percent white and at least four percent black. Segregation is estimated between classrooms 
within schools and between schools within places, where “total segregation” is the sum of 
average within-school segregation and average between-school segregation. Whereas Clotfelter 
et al. look at between-school segregation within counties, I look at segregation within 
municipalities because there is no county-like unit available. Because Brazilian municipalities 
are smaller than North Carolina counties, between-school segregation and total segregation in 
Brazil are lower than they would be if comparable units were used. One drawback to using the 
Dissimilarity Index for these purposes is that it is not additively decomposable, biasing “total
segregation” (Reardon & Firebaugh, 2002). It is unclear if this bias differs between places.
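
For reference, the two-group Dissimilarity Index can be sketched as below, using its standard definition: half the sum, over units, of the absolute difference between each unit's share of one group and its share of the other. The function name and the input layout (a list of count pairs) are assumptions for illustration; the sketch covers a single school or place and does not reproduce the population weighting or the four percent inclusion rule described above.

```python
def dissimilarity(units):
    """Two-group Dissimilarity Index D = 0.5 * sum_j |b_j / B - w_j / W|.

    `units` is a list of (n_black, n_white) counts, one pair per classroom
    (for within-school D) or per school (for between-school D).
    """
    B = sum(b for b, _ in units)
    W = sum(w for _, w in units)
    if B == 0 or W == 0:
        raise ValueError("D is undefined when either group is absent")
    return 0.5 * sum(abs(b / B - w / W) for b, w in units)
```

D can be read as the share of either group that would need to change units to equalize the two distributions.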
Simulation Analyses

The second stage of the analysis compares classroom segregation by race in the observed data to 
that in data simulating the four hypothetical assignment processes outlined above, so as to assess 
whether the data are more consistent with random assignment or with pseudo-tracking assignment.
I begin with a graphical analysis, comparing the observed LOWESS associations of racial 
segregation and each simulated baseline with the associations in the four types of simulations. 
Patterns differed little among random assignment draws and noisy achievement sorting draws, so 
the first draw was used. 

The graphical analysis is limited because the four baselines are correlated. To disentangle 
their associations with observed classroom segregation, I estimate 5th- and 9th-grade two-level
hierarchical multiple regression models of schools within years, in which the set of classroom
assignments specific to a school in a given grade and year is nested within years. Given the set of
baselines $X_{it}$ describing the expected segregation of classroom assignment i in year t under each
assignment process, I model the racial segregation of the classroom assignment $H_{it}$ as

$$H_{it} = \gamma_{00} + u_{0t} + (\gamma_{\cdot 0} + u_{\cdot t}) X_{it} + r_{it} \tag{3}$$

$$r_{it} \sim N(0, \sigma^2); \quad \begin{bmatrix} u_{0t} \\ u_{\cdot t} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{00} & \tau_{0\cdot} \\ \tau_{\cdot 0} & \tau_{\cdot\cdot} \end{bmatrix} \right)$$

where $\gamma_{00}$ is the year-average intercept, $u_{0t}$ is a year-specific intercept, $\gamma_{\cdot 0}$ is the set of
year-average slopes on each baseline, $u_{\cdot t}$ are year-specific slopes, and $r_{it}$ is the total
within-year error. The estimates of interest are $\gamma_{\cdot 0}$, which are year-average associations,
meaning that they are the means of the year-specific slopes. This is preferable to an OLS estimate,
which would implicitly give more weight to the slopes of years with more observations when
incorporating the four years of data into a single model. Note also that the baselines, $X_{it}$, are
not centered, so $\gamma_{00}$ indicates the predicted amount of segregation when each baseline predicts
no segregation. I estimate these models in the observed data as well as in the 50 simulations of
random assignment
and the 50 of noisy achievement sorting. These estimates offer a picture of what would be 
observed if classrooms were assigned randomly or by a correlate of achievement. 

Regression Analyses 

The third stage of the analysis compares racial segregation’s association with the random 
baseline to its associations with a host of predictors, first by estimating year-average bivariate 
associations and then by estimating year-average multiple regression associations among a set of 
potential predictors identified in the bivariate analysis. 

I assess the estimated associations using metrics that reflect both the effects and the
prevalence of practices. The goal is to describe the pattern of segregation and to assess which
potential mechanisms that pattern is most consistent with. This indicates which mechanisms are
least and most likely to be major sources of classroom segregation nationwide, helping clarify
the big picture. Having little information on schools’
practices, I tackle this problem by making use of correlates that are hypothesized causes (e.g., 
sorting policies), mediators (e.g., achievement segregation), moderators (e.g., achievement 
stratification), and even effects (e.g., teacher disparities as an effect of lobbying for teachers) of 
the practices identified in the literature. As in the model described in Equation 3, the estimates 
use hierarchical linear models, stratified by grade, in which the set of classroom assignments 
specific to a school in a given grade and year is nested within years. Each model uses a
group-mean-centered predictor $X_{it}$ describing the classroom assignment i in year t.

Because the random baseline is mechanically correlated with classroom size and school
racial diversity, racial segregation that occurs entirely by chance could also be spuriously
associated with other segregation predictors. To assess whether observed associations could
occur under random assignment, I repeat each model 50 times, each with the values of $H_{it}$ and
$X_{it}$ in a simulation of random classroom assignment. I then average the $\gamma_{10}$ estimates to get a
single counterfactual association. If racial classroom segregation is primarily due to chance,
these simulated estimates should be similar to the observed estimates. Note, however, that I do not
do this for the teacher disparities predictors because, a priori, they have no association with racial
segregation given random classroom assignment. To assess the explanatory power of $X_{it}$, I report
the percentage of total within-year variance explained when adding $X_{it}$ to the model,

$$\%V = 100 \times \frac{\sigma^2_{null} - \sigma^2}{\sigma^2_{null}} \tag{4}$$

where $\sigma^2$ is taken from the bivariate model and $\sigma^2_{null}$ is the variance of $r_{it}$ in a null model that
excludes $X_{it}$.
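
The percentage of variance explained in Equation 4 is a proportional reduction in residual variance. A minimal sketch, with a hypothetical function name and made-up variances in the usage note:

```python
def pct_variance_explained(var_null, var_model):
    """Percentage reduction in within-year residual variance.

    `var_null` is the residual variance of the null model and `var_model`
    is the residual variance after adding the predictor.
    """
    if var_null <= 0:
        raise ValueError("the null-model variance must be positive")
    return 100.0 * (var_null - var_model) / var_null
```

For example, a residual variance that falls from 4.0 to 3.0 corresponds to 25% explained. In multilevel models this kind of measure can occasionally be negative, because an estimated residual variance may rise slightly when a predictor is added.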

For the place predictors, I assess their role solely by their explanatory power because this 
captures the extent to which place-specific means vary across places relative to the total variance 
within years. I use a null model of classroom assignment i within place-year p within year t with
place-year random intercepts $u_{0p}$ and year-specific intercepts $v_{00t}$:

$$H_{ipt} = \gamma_{000} + u_{0p} + v_{00t} + r_{ipt} \tag{5}$$

$$r_{ipt} \sim N(0, \sigma^2); \quad u_{0p} \sim N(0, \tau_{00}); \quad v_{00t} \sim N(0, \tau_{000})$$

To assess the explanatory power of the place-year random intercepts, I report the percentage of
total within-year variance explained by adding the place-year level into the model. In other
words, $\sigma^2$ in Equation 4 is drawn from the model in Equation 5 while $\sigma^2_{null}$ in Equation 4
continues to be the variance of $r_{it}$ in a null two-level model of assignments within years.
To assess the potential impact implied by $\gamma_{10}$, I also report what I call the predicted
contribution to segregation. This is the amount of segregation that would be attributed to the
predictor, as a percentage of the total classroom-level racial segregation in the model sample, if
the model results described a causal relationship. Of course, the estimates are not causal, so the
predicted contribution should not be confused with the actual contribution, which is unknown.
Instead, the predicted contribution measure contextualizes the estimated associations by
weighing both association strength and the prevalence/size of the predictor. Given a school
characteristic $X_{it}$, I compute the predicted contribution as

$$\%S = 100 \times \frac{\sum_t \sum_i N_{it} E_{it}\, \gamma_{10} X_{it}}{\sum_t \sum_i N_{it} E_{it}\, H_{it}} \tag{6}$$

where the numerator is the predicted contribution of $X_{it}$ over all years t and the denominator is
the total classroom segregation over all years t.

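The multiple regression model uses three-level HLM, stratified by grade, in which the set
of classroom assignments specific to a school in a given grade and year is nested within
municipality-years, which are nested within years. I model the racial segregation of the
classroom assignment $H_{ipt}$ as

$$H_{ipt} = \gamma_{000} + u_{0p} + v_{00t} + (\gamma_{\cdot 00} + u_{\cdot p} + v_{\cdot 0t}) X_{ipt} + r_{ipt} \tag{7}$$

$$r_{ipt} \sim N(0, \sigma^2); \quad \begin{bmatrix} u_{0p} \\ u_{\cdot p} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{00} & \tau_{0\cdot} \\ \tau_{\cdot 0} & \tau_{\cdot\cdot} \end{bmatrix} \right); \quad \begin{bmatrix} v_{00t} \\ v_{\cdot 0t} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{000} & \tau_{0\cdot 0} \\ \tau_{\cdot 00} & \tau_{\cdot\cdot 0} \end{bmatrix} \right)$$

where $X_{ipt}$ is a predictor describing the classroom assignment i within municipality-year p in
year t, $\gamma_{000}$ is the year-average intercept, $u_{0p}$ is a municipality-year-specific intercept, $v_{00t}$ is a
year-specific intercept, $\gamma_{\cdot 00}$ is a set of year-average slopes on the variables in $X_{ipt}$, $u_{\cdot p}$ is a set of
municipality-year-specific slopes, $v_{\cdot 0t}$ is a set of year-specific slopes, and $r_{ipt}$ is the total
within-year error.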


How Racially Segregated Are Classrooms?
US high schools are particularly known for classroom segregation by race due to tracking. A
recent study by Clotfelter et al. (2020) used the Dissimilarity Index, D, to measure racial and
ethnic segregation between classrooms within schools (i.e., classroom segregation) and between
schools within counties (i.e., school segregation) in North Carolina.
Figure 2 presents a comparison of their findings for white/black segregation among 4th and 10th
graders in 2017 to my findings for white/preto segregation among Brazilian 5th and 9th graders in
2017 following their procedure (see also Table A1 in Appendix B). In Figure 2, the gray portion
of the bars is between-school segregation and the black portion is classroom segregation, where 
the sum is what Clotfelter et al. refer to as “total segregation.” Between-school segregation and 
total segregation in Brazil are likely underestimated here because the Brazilian analysis uses 
municipalities as the population of interest whereas the North Carolina analysis uses counties. 

Overall, Brazil’s 5th graders experienced more white/black segregation (D = .52) than
North Carolina’s 4th graders (.49) while Brazil’s 9th graders experienced less (.44) than North
Carolina’s 10th graders (.53). In each case, the number of students who would need to be
reassigned in order to balance classrooms and schools is roughly half of the maximum possible.
This is despite substantially lower between-school segregation in Brazil; in both grade levels,
Brazilian between-school segregation is just over half that of North Carolina (Brazil 5th grade,
D = .23; North Carolina 4th grade, D = .43; Brazil 9th grade, D = .18; North Carolina 10th
grade, D = .33).

Whereas North Carolina’s 4th graders are primarily segregated between schools with little
classroom segregation (D = .06), half of the segregation among Brazil’s 5th graders is due to
classroom segregation (D = .29). Brazil’s 9th graders are nearly as segregated across classrooms
as its 5th graders (D = .25). In each grade analyzed, Brazil’s students are more segregated across
classrooms than North Carolina’s high-schoolers (D = .20). Classroom segregation also contributes
over half of the total segregation in both Brazilian grades, whereas in North Carolina, it
contributes at most 37.7%.
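
The classroom shares quoted above follow from the reported components, since total segregation is the sum of between-school and classroom segregation. The short check below uses only the D values given in the text; the dictionary layout and function name are illustrative. Note that the rounded components for Brazil’s 9th grade sum to .43 rather than the reported total of .44, presumably because of rounding.

```python
# Reported Dissimilarity Index components (between-school, classroom), as quoted in the text.
reported = {
    "Brazil 5th": (0.23, 0.29),
    "Brazil 9th": (0.18, 0.25),
    "NC 4th": (0.43, 0.06),
    "NC 10th": (0.33, 0.20),
}


def classroom_share(between, classroom):
    """Classroom segregation as a percentage of total (between-school + classroom)."""
    return 100.0 * classroom / (between + classroom)


# Share of total segregation attributable to classrooms, rounded to one decimal.
shares = {name: round(classroom_share(*parts), 1) for name, parts in reported.items()}
```

The North Carolina 10th-grade share works out to 37.7%, matching the maximum cited for North Carolina, while both Brazilian shares exceed 50%.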

Appendices C and D provide a richer description of the extent of classroom segregation. 
Appendix C describes the scale of racial segregation in the Brazilian public school system by 
decomposing the multi-group racial segregation between classrooms throughout the nation into 
units long understood as segregated: regions, municipalities, and schools. In each year and grade,
the plurality of racial segregation (38-42% in 5th grade, 30-35% in 9th grade) in Brazil’s
multi-classroom public schools occurs between classrooms in the same school, not traditional suspects
like regional, municipal, or school differences. Appendix D describes how each racial group
contributes to multigroup classroom segregation. Each 9th-grade group and dyad of groups
contributes similarly to segregation. Multigroup segregation in 5th grade is driven more by
segregation of pardos and students who responded “I don’t know” (particularly segregation
between each of those groups and whites and between the two groups themselves) and less by
segregation of Asian and indigenous students. After subtracting random baselines, all 9th-grade
estimates are very low, while segregation of 5th-grade pardos and “I don’t know” students, and
especially segregation between those groups, contributes more to multigroup segregation.

Random Assignment or Pseudo-Tracking 
Is the observed pattern of classroom segregation more consistent with random assignment or 
pseudo-tracking? Each panel in Figure 3 compares racial segregation under five conditions — the 
observed value and simulated values using random assignment, age sorting, strict achievement 
sorting, and noisy achievement sorting — to the simulated baseline for one of the four assignment 
processes. Thus, in each panel, one line is the observed pattern, one is the pattern for the
condition corresponding to the X-axis, and three lines are non-corresponding conditions.

In the random baseline panel, all five lines track similarly. In the age sorting and strict
achievement sorting panels, observed segregation has a smaller slope than the corresponding
conditions, tracking better with the non-corresponding conditions. In the 5th grade noisy
achievement sorting panel, the observed segregation line is not particularly more similar to any
one condition, whereas, in the 9th grade panel, it tracks better with the random assignment and
age sorting lines. Over the eight panels, the observed lines deviate most from the age sorting and
strict achievement lines, tracking more similarly with the noisy achievement lines and, in
particular, the random assignment lines. Observed segregation also tends to track less closely
with all of the simulation lines in 5th grade due to having a higher intercept.

One challenge to distinguishing which simulated assignment processes fit the observed 
data better than others is that the simulated segregation levels are correlated, particularly for 
random assignment and noisy achievement sorting. Table 2 attempts to parse this by regressing 
observed segregation on the four simulated baselines. The 1st and 4th columns present the
findings for 5th and 9th grade, respectively. The 2nd and 5th columns present the average estimates
and their 10-90% ranges over the 50 draws in the random assignment condition. This depicts
what one would observe if all schools used random assignment. The 3rd and 6th columns present
similar estimates for the noisy achievement sorting condition. Net of the other baselines, the
random baseline continues to have a strong association with observed segregation ($\gamma$ = 1.105 in
5th grade, $\gamma$ = .917 in 9th grade). That is, an increase in the random baseline is associated with a
similar increase (110.5% and 91.7%, respectively) in observed segregation. This comports with
the near-one associations that would occur under random assignment. 

The other baselines have weak associations. Only the strict achievement sorting baseline
is significant in 5th grade and only the age sorting baseline is significant in 9th grade. In both
cases, the estimated association is about .05, or five percent of what it would be if all schools
used the same sorting process as the simulations. These estimates are more similar to what would
be observed under random assignment than under pseudo-tracking assignment.

However, the pseudo-tracking baselines are typically more associated with observed
segregation than would occur under random assignment. Likewise, the intercepts, particularly
in 5th grade ($\gamma$ = .013), are greater than would occur under random assignment. It is also
noteworthy that, compared to random assignment, the within-year variance explained by the
simulated baselines is less and some slopes vary more over time.

Correlates of Non-Chance Segregation 
Simulated assignments imperfectly proxy for actual assignments. There are also classroom 
segregating mechanisms that are not pseudo-tracking, namely teacher steering and parent 
lobbying. Schools and their localities may also have different tendencies toward segregating net 
of demographic context and assignment policies due to preferences for racial segregation or 
integration. To further assess whether segregation by chance drives the classroom segregation in 
Brazil, I consider several correlates of non-chance segregation processes. 
Bivariate Analysis 
I begin by estimating bivariate associations between racial segregation and the set of correlates in 
the observed data. These associations might occur under random assignment, in which case the 
relationship would be incidental to the characteristics of students in the school rather than a 
signal of how segregation occurred. To assess this possibility, I also estimate the associations in 
simulations using random assignment (n=50). 

I contextualize these regression results in two ways: one, explanatory power as measured
by the amount of within-year variance explained by a predictor and, two, impact as measured by
the percentage of segregation that would be attributable to the predictor if the model described a
causal relationship. Figure 4 presents the variance explained by each variable along with the
10th-90th percentile range of the variance explained when simulating random assignment. Figure 5
presents the 95% confidence interval for the predicted contribution of each variable along with
the 10th-90th percentile range under random assignment. Importantly, this metric does not capture
causality or describe the predictor’s true contribution; rather, it provides a sense of how big the
estimated association is. Further details are provided in Appendix E.

The strong association observed in the previous section between classroom segregation 
and the random baseline is also apparent in the bivariate analysis. Under random assignment, this 
association would be one; yet in both grades the association is statistically significantly greater 
than one. The random baseline explains 15.9% of the total variation in racial segregation in the 
5th grade sample and 23.6% in the 9th grade sample. In both cases, this is lower than would 
happen if all schools used random assignment. Under universal truly-random assignment, the 
predicted contribution metric for the random baseline is 100%. The metric for the observed data 
is not far off: 82.3% in 5th grade and 90.5% in 9th grade. 

Seven predictors relate to achievement sorting: the simulated strict and noisy 
achievement sorting baselines, an indicator of whether principals report achievement sorting, 
classroom segregation by Portuguese test scores, racial stratification by Portuguese test scores, 
classroom segregation by math test scores, and racial stratification by math test scores. Among 
them, the baselines and stratification predictors have stronger associations than they would under 
random assignment. Nonetheless, the stratification predictors explain little of the variation in
racial segregation in either grade while the baselines explain meaningful variation in classroom
segregation but no more than they would explain under random assignment. The estimated
associations for segregation and stratification variables each imply small but potentially
meaningful impacts on segregation (as much as eight percent on the predicted contribution
metric), but in no cases is the contribution more than two percentage points greater than under
random assignment. Likewise, the associations with the sorting baselines imply large
contributions to segregation but no more than would occur under random assignment.

Four predictors relate to age sorting: the simulated age sorting baseline, whether 
principals report using age sorting, age segregation, and the age stratification of racial groups. 
None explain more variation than they would under random assignment. Additionally, none of
the small predicted contributions implied by the estimated associations are more than two
percentage points greater than they would be under random assignment.

I also consider the degree to which the classrooms and racial groups in schools are 
differentiated by SES using SES segregation and stratification predictors and racial disparities in 
teacher status as measured by teachers’ experience, salary, and tenure status. These variables are 
intended to indicate teacher steering and parent lobbying, though other sorting processes could 
produce associations between them and classroom segregation. In both grades, SES stratification 
and teacher disparities have precise, near-zero estimated association with classroom segregation. 
SES segregation has a stronger association; though it explains little variation in either grade, the 
predicted contribution is 6.1% in 5th grade and 4.1% in 9th grade. However, this is only 2.2%
and 1.6% more than would have occurred under random assignment, respectively. 

To assess the role of place, I alternately included random intercepts at three geographic 
scales: municipalities, states, and regions. The percentage of variance explained indicates how 
much the mean racial segregation varies across places at a given scale. In both grades, little
variation occurs at the state or regional levels, as would be the case under random assignment.
However, there is substantial variation at the municipal level: about 10.6% of the total variation
in 5th grade and 9.7% in 9th grade. This is 4.2 percentage points more than would occur under
random assignment in 5th grade, and about 2.5 percentage points more in 9th grade.

Finally, I included two measures to capture whether racial segregation is local to school 
administrations, by looking at segregation in peer shifts, and/or to the school itself, by looking at 
segregation in the preceding and following survey years. Segregation in adjacent years is 
minimally associated with segregation in a given year, as expected under random assignment. 
Segregation in peer shifts, though, is more associated with racial segregation than it would be 
under random assignment. The predicted contribution metrics are 18.1% (5th grade) and 15.5%
(9th grade), or 8.6 and 6.2 percentage points more than would occur under random assignment.
The explanatory power is smaller, though, at 2.3 and 1.4 percentage points more than simulated. 
Multiple Regression Analysis 
The municipality random intercepts are the only variable that explains substantially more 
variation than would occur under random assignment while peer shift segregation is the only 
variable that implies a substantially greater contribution to segregation than would occur under 
random assignment. Both capture differences in local tendencies and are likely to be correlated 
both with one another and with the random baseline. To assess whether they account for the 
random baseline’s association with classroom segregation, I consider a multiple regression 
analysis focused on the schools for which I observe peer shifts. 

In Table 3, models 1-3 present bivariate associations using each of the three variables. 
Model 4 loads the random baseline and municipality-year random intercepts while model 5 loads 
peer shift segregation along with municipality-year random intercepts. Model 6 is the full model
with the random baseline, peer shift segregation, and municipality-year random intercepts. In
both grades, the random baseline is consistently associated with classroom segregation across the
models, with an association near one. Peer shift segregation has a less robust association; when
municipality-year random intercepts are included, the association becomes null in 5th grade and
is flipped in 9th. Additionally, while accounting for municipality differences in means explains
6.3% (5th) and 8.7% (9th) of the within-year variation in classroom segregation, adding them to
the random-baseline-only model explains little additional variation in either grade.

Results 
Though the literature on racial classroom segregation has focused primarily on tracking in US 
high schools, Brazil’s non-tracking 5th- and 9th-grade classrooms are more racially segregated
than North Carolina’s 10th grade classrooms. Classroom-level segregation is a primary source of
overall racial segregation in Brazil’s school system, accounting for more segregation than 
regional-level and school-level segregation. How does this happen? 

Both simulation analyses and regression analyses using observed school features point to 
segregation by chance as a major contributor. In simulations, random assignment produces levels 
of racial segregation similar to pseudo-tracking practices like age and achievement sorting. The 
association between observed segregation and the random baseline is also strong enough that it 
would account for over 80% of 5th grade segregation and over 90% of 9th grade segregation were
it a causally identified estimate.

I assess the possibility that this association is an artifact of other processes in two ways: 
simulating alternative approaches to assignment and analyzing the associations between 
observed segregation and indicators of non-chance assignments. The pattern of observed
segregation is more consistent with simulations using random assignment than with
pseudo-tracking simulations. Whereas racial segregation is strongly and robustly associated with the
expected value of segregation under random assignment, its associations with indicators of
non-chance assignment practices are similar to those expected under random assignment. For example,
the academic segregation that would be expected to accompany racial segregation if driven by
pseudo-tracking practices typically has no more association with racial segregation than it would
under random assignment. The exception is age segregation in 5th grade, which has a predicted
contribution score of 5.2%, compared to 4.1% in simulations using random assignment. However, the
score is much greater (18.8%) in the age sorting simulation (analysis available upon request).

Additionally, racial segregation is chaotic over time; after accounting for their random 
baselines, two schools with high and low segregation respectively in one year have similar 
segregation levels two years later. Likewise, segregation levels in peer shifts are not positively 
associated after accounting for municipal tendencies. This indicates that segregation is not driven 
by school features that are stable over short periods (e.g., specific faculty, student composition,
organizational culture, community practices, etc.). Classroom segregation is also geographically 
diffuse; state differences explain only 0.7 percentage points more variation in 5th grade and no
more in 9th grade than they would under random assignment.

Yet the evidence is clear that segregation by chance is not the sole source of classroom 
segregation. The random baseline explains less variation and implies a smaller contribution to 
segregation than it would if all schools used fully random assignment. Graphical analyses show 
that there is consistently more segregation in 5th grade than predicted in simulations of random
assignment. Multiple regression analysis also shows that simulated achievement sorting in 5th
grade and age sorting in 9th grade remain associated with observed segregation after accounting
for the random baseline. Though these associations are weak, they would not occur under
universal, truly random assignment. Municipality random intercepts also explain more variation
than they would under random assignment. Additionally, in some estimates the association
between random baselines and observed segregation levels is significantly greater than one,
indicating that some of the non-chance segregation is associated with the random baseline (e.g., a
feedback effect). Finally, the patterns of classroom segregation in 9th grade are more consistent
with random assignment than those in 5th grade, across all analyses.

Limitations 

It is possible that flaws in simulations of assignment practices downwardly bias their estimated 
associations with observed segregation. Random assignment proxies for arbitrary assignment and 
end-of-year test scores proxy for achievement or perceived ability at the beginning of the year. 
Simulated assignments create classrooms that are as equal-sized within a school as possible, but 
schools may vary classroom size in ways that affect age or achievement segregation. However, if 
these flaws were distorting the overall picture, I would expect different findings in the analysis 
using correlates of non-chance segregation. For example, if achievement sorting were driving 
racial segregation, I would expect schools’ levels of achievement segregation and racial 
stratification of achievement to be stronger predictors of racial segregation. 

Some correlates of non-chance segregation have high missingness, meaning that the 
bivariate associations of different predictors are estimated with distinct subsamples of the 
population of schools. This study looks to identify coarse patterns rather than exact associations, 
which mitigates the risk of non-random missingness, but it is still possible that the
observed sample is substantially different from the population, which could lead to large 
disagreements between sample estimates and true associations in the multi-classroom public 
school population, particularly when samples are small due to non-response. This is primarily an
issue for the SES predictors, as students who do not provide parental education information
could be in substantially different schools than those who do. It is also a concern when
considering segregation in peer shifts, which is primarily missing due to schools without shift 
systems. In this case, the assumption is that the importance of school administrations implied by 
the correlation of segregation levels across school shifts is generalizable to schools without shift 
systems. It is less of a concern for the teacher disparity measures, which have high missingness 
primarily because the same teachers teach their respective subjects to both classrooms or the 
teachers in the grade do not vary with respect to the characteristic. 

Further, it must be stressed that the analyses do not describe causal relationships. The 
conclusions I draw about mechanisms of segregation are based on consistency and inconsistency 
with the patterns expected under known classroom segregation mechanisms. For example, 
sorting students by achievement may well contribute substantially to racial segregation when 
implemented; my finding that achievement segregation and racial stratification by achievement 
have small associations with racial segregation merely indicates that, even if the causal effect is 
high, achievement sorting is unlikely to be a major contributor to racial segregation nationwide. 

Finally, while the data implicate segregation by chance as the primary driver of racial 
classroom segregation, the findings are not dispositive. Proving segregation by chance would 
require identifying the causal role of stochasticity in classroom assignments. This is difficult for 
a number of reasons, including challenges to measuring stochasticity of classroom assignments 
in observed data and mechanical associations between the random baseline and classroom size 
and racial composition that cannot be fully controlled for without removing all variation in the 
random baseline. The extant literature provides no guidance for this task because it approaches
segregation by chance as a hypothetical matter (i.e., how much segregation could be by chance?)
rather than as a social matter (i.e., how much segregation is by chance?).

Discussion

Racial classroom segregation is not specific to tracking contexts. Despite the abundance of
non-tracking contexts, the classroom segregation literature has rarely examined them. The findings
presented here illustrate the need to cast a wider net: black/white classroom segregation in Brazil 
is on par with that in the US high schools that have captured researchers’ attention, and it appears 
to occur by chance, a mechanism that has received little attention. 

Though classroom segregation has garnered little interest in Brazil, it is clear that 
classroom assignments matter. Alves and Soares (2007, 2008) have demonstrated that learning 
gains vary greatly between same-school classrooms in Brazil. Botelho et al. (2015) identified 
widespread racial discrimination in grading in Brazil; if classrooms are racially segregated, this 
could amplify racial inequity. Moreover, classroom segregation by race reduces interracial 
contact (Moody, 2001). These concerns persist even when segregation occurs by chance. 

Segregation by chance lends itself to interpretations that strip schools of agency and, with 
it, responsibility: if it happened by chance, how could it be helped? In the case of classroom
segregation, the answer is: only too easily. Segregation by chance can only be a substantial driver 
of racial classroom segregation if schools choose to accept unexplained and unintended racial 
segregation. Otherwise, even a school using random assignment could keep segregation by 
chance low by monitoring drafted classroom assignments for substantial racial imbalance and 
reassigning some students before the school year begins.
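
To make the monitoring idea concrete, the following is a minimal Python sketch of one way a school could screen drafted assignments: it redraws a random assignment until no classroom's share of any racial group deviates from the school-wide share by more than a tolerance. Everything here is an illustrative assumption (the function names, the 10-point tolerance, the redraw strategy, and the toy school); it is not a procedure described in the data or used by any school in the sample, and it assumes there are at least as many students as classrooms.

```python
import random
from collections import Counter

def max_share_gap(classrooms):
    """Largest absolute gap between a classroom's share of a group and the school-wide share."""
    school = Counter(g for room in classrooms for g in room)
    n = sum(school.values())
    gap = 0.0
    for room in classrooms:
        counts = Counter(room)
        for group, total in school.items():
            gap = max(gap, abs(counts[group] / len(room) - total / n))
    return gap

def draft_until_balanced(students, n_rooms, tolerance=0.10, max_tries=1000, seed=0):
    """Redraw random assignments, keeping the most balanced draft seen so far.

    Stops early once the best draft is within the (hypothetical) tolerance.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(max_tries):
        shuffled = students[:]
        rng.shuffle(shuffled)
        draft = [shuffled[i::n_rooms] for i in range(n_rooms)]
        if best is None or max_share_gap(draft) < max_share_gap(best):
            best = draft
        if max_share_gap(best) <= tolerance:
            break
    return best

# Toy school of 60 students split across two classrooms.
toy_school = ["white"] * 20 + ["parda"] * 25 + ["preta"] * 15
balanced = draft_until_balanced(toy_school, n_rooms=2)
```

The check requires race to be recorded and consulted at the drafting stage, which is precisely the race-conscious step discussed in the text.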

The more interesting question might be: if it is only by chance, why don’t schools just fix
it? It is not due to racial ambiguity, as Brazilians reliably racially categorize one another (Bailey, 
2009). I offer an explanation rooted in racial ideology, arguing that unexplained racial
segregation in schools may be more tolerated, and race-based integration less tolerated, in Brazil
than in the US. Brazil’s relationship to racial segregation is shaped by the absence of de jure
segregation in the 20th century. This is a long-standing, government-promoted cause célèbre
used to promote the narrative that Brazil is a “racial paradise.” This ideology, called racial 
democracy, imagines Brazilians as a single mixed race and Brazilian society as free from racial 
difference. As a national myth, this ideology helps legitimate de facto racial segregation as not 
racial per se. Another consequence of racial democracy is antiracialism, a system of manners that 
hampers race-based integration efforts by discouraging explicit racial ascription. 

If Brazil’s racial classroom segregation by chance is due to denying the social reality of 
race, racial segregation by chance may be a feature of other Brazilian institutions as well; prior 
work has shown the potential for substantial occupational segregation by chance in other 
contexts (Bygren, 2013; Carrington & Troske, 1997). Additionally, racial segregation by chance 
may also be relevant to other societies, such as France (Beaman & Petts, 2020), where denial 
about the social reality of race and taboos around discussing race are widespread. In the US, the 
strong association between racial segregation and malicious intent works against the possibility 
of racial classroom segregation by chance, but this may change if colorblind, post-racial 
discourse becomes further entrenched. At present, economic segregation by chance seems more 
likely in the US. Economic inequality is often understood in racial terms (McDermott, 2006); 
norms minimize economic differences (e.g., the notion that nearly everyone is middle class); data 
on students’ economic characteristics are very coarse (i.e., free or reduced-price lunch); and
economic segregation is rarely problematized in everyday discourse. Thus, much as a colorblind 
racial ideology facilitates racial segregation by chance within Brazilian schools, the class-blind
ideology and data framework in US schools may facilitate economic segregation by chance.

Tables


Table 1. Descriptive Statistics of Schools in the Analytic Sample, Over All Years. 
                                Grade 5                       Grade 9
                                N         Mean      SD        N         Mean      SD
Racial Segregation              53,452    0.073     0.048     32,068    0.057     0.035

School Characteristics
# Students                      53,452    58.61     25.09     32,068    68.19     31.99
# Classes                       53,452    2.42      0.82      32,068    2.49      0.90
Average Classroom Size          53,452    24.01     4.78      32,068    2102      5.78
% White                         53,452    31.70     15.40     32,068    32.96     18.81
% Parda/o                       53,452    44.03     15.10     32,068    45.39     15.68
% Preta/o                       53,452    8.66      6.46      32,068    10.18     7.30
% Indigenous                    53,452    2.41      3.18      32,068    2141      2.99
% Yellow                        53,452    2.18      2.45      32,068    3.50      3.10
% Don't Know                    53,452    11.01     7.84      32,068    5.85      4.71

Segregation Correlates
Random Baseline                 53,452    0.051     0.016     32,068    0.049     0.016
Strict Ach Sorting Baseline     53,452    0.058     0.034     32,068    0.055     0.032
Noisy Ach Sorting Baseline      53,452    0.055     0.022     32,068    0.053     0.021
Test Score Sorting Policy       52,866    0.051     0.221     31,725    0.036     0.187
Portuguese Segregation          53,435    0.039     0.062     32,044    0.034     0.050
Portuguese Stratification       53,424    0.080     0.054     32,042    0.072     0.049
Math Segregation                53,435    0.040     0.064     32,044    0.032     0.048
Math Stratification             53,424    0.079     0.053     32,042    0.070     0.048
Age Sorting Baseline            53,452    0.052     0.031     32,068    0.051     0.030
Age Sorting Policy              52,866    0.347     0.476     31,725    0.366     0.482
Age Segregation                 49,773    0.082     0.098     31,190    0.084     0.114
Age Stratification              49,764    0.146     0.126     31,188    0.115     0.100
SES Segregation                 6,684     0.037     0.050     25,210    0.033     0.045
SES Stratification              6,679     0.079     0.062     25,209    0.085     0.066
T Exp. Disparity                16,415    0.055     2219      5,743     0.034     1.247
T Salary Disparity              13,620    0.003     0.270     4,136     0.003     0.192
T Tenure Disparity              11,444    0.003     0.160     6,482     0.001     0.119
Segregation in Peer Shift       12,228    0.069     0.045     4,030     0.055     0.035
Segregation in Adjacent Years   18,256    0.072     0.045     8,858     0.056     0.033


Note: Students are included in the analytic sample if they responded to the race question. Schools 
are included in the analytic sample if they are public schools within which all classes in the given 
grade have at least 75% of students responding to the race item and there are at least two classes. 
Correlates are missing due to non-response or inapplicability (e.g., if there is only one shift in the 
school building). Segregation, stratification, and teacher disparity variables are further restricted 
for comparability (see Appendix A). 


CLASSROOM SEGREGATION WITHOUT TRACKING | 44 


Table 2. Hierarchical Multiple Regression Model of Classroom Racial Segregation on Simulated Baselines in Observed Data
and in Simulations of Random Classroom Assignment and Noisy Achievement Sorting.

                              Grade 5                                            Grade 9
                              Observed         Random           Noisy Ach.       Observed         Random           Noisy Ach.
                                               Assignment       Sorting                           Assignment       Sorting
Random Assignment Baseline    1.105            0.998            -0.000           0.917            1.002            0.000
                              (1.033,1.177)    (0.985,1.013)    (-0.021,0.018)   (0.875,0.960)    (0.980,1.024)    (-0.023,0.023)
Noisy Ach. Sorting Baseline   0.007            -0.000           1.000            0.066            -0.001           1.000
                              (-0.061,0.075)   (-0.018,0.015)   (0.982,1.021)    (-0.002,0.133)   (-0.019,0.018)   (0.975,1.028)
Strict Ach. Sorting Baseline  0.052            -0.001           0.000            0.014            -0.000           0.000
                              (0.029,0.076)    (-0.009,0.010)   (-0.011,0.010)   (-0.017,0.045)   (-0.012,0.012)   (-0.014,0.016)
Age Sorting Baseline          0.023            -0.000           0.000            0.053            -0.001           0.000
                              (-0.015,0.061)   (-0.007,0.006)   (-0.008,0.006)   (0.037,0.068)    (-0.009,0.010)   (-0.010,0.008)
Intercept                     0.013            -0.000           -0.000           0.005            -0.000           -0.000
                              (0.011,0.015)    (-0.001,0.000)   (-0.001,0.001)   (0.004,0.006)    (-0.001,0.001)   (-0.001,0.001)
Variance Explained (%)        0.161            0.287            0.444            0.239            0.310            0.457
                                               (0.282,0.293)    (0.441,0.449)                     (0.304,0.317)    (0.452,0.463)
# of Observations             53452            53452            53452            32068            32068            32068

Year Variation                SD (p-value)     SD (90-10 range) SD (90-10 range) SD (p-value)     SD (90-10 range) SD (90-10 range)
Random Assignment Baseline    0.063            0.017            0.021            0.026            0.024            0.020
                              (0.013)          (0.004,0.033)    (0.007,0.039)    (>.500)          (0.008,0.045)    (0.008,0.031)
Noisy Ach. Sorting Baseline   0.057            0.019            0.023            0.057            0.023            0.027
                              (0.032)          (0.006,0.035)    (0.009,0.039)    (0.033)          (0.007,0.042)    (0.009,0.051)
Strict Ach. Sorting Baseline  0.013            0.010            0.011            0.025            0.011            0.015
                              (>.500)          (0.002,0.016)    (0.003,0.019)    (0.078)          (0.004,0.020)    (0.005,0.026)
Age Sorting Baseline          0.036            0.006            0.008            0.009            0.008            0.009
                              (0.000)          (0.002,0.013)    (0.002,0.015)    (>.500)          (0.003,0.017)    (0.004,0.015)
Intercept                     0.002            0.001            0.001            0.001            0.001            0.001
                              (0.158)          (0.000,0.001)    (0.000,0.001)    (>.500)          (0.000,0.001)    (0.000,0.001)
Note: Each column presents the results of a 2-level HLM model with years at level 2 such that each coefficient is the tendency in 
the average year in 2011, 2013, 2015, and 2017.

Table 3. Hierarchical Multiple Regression Models of Classroom Racial Segregation, by Grade.

                              (1)               (2)               (3)               (4)               (5)               (6)
Grade 5
Intercept                     0.065             0.065             0.065             0.065             0.065             0.065
                              (0.061,0.068)     (0.061,0.068)     (0.063,0.067)     (0.062,0.067)     (0.063,0.067)     (0.062,0.067)
Random Baseline               1.173             --                --                1.129             --                1.081
                              (1.108,1.238)     --                --                (0.998,1.261)     --                (0.962,1.200)
Segregation in Peer Shift     --                0.218             --                --                0.026             -0.020
                              --                (0.194,0.242)     --                --                (-0.021,0.073)    (-0.061,0.021)
Muni-Year Random Intercepts                                       x                 x                 x                 x
Variance Explained (%)        16.6              4.7               6.3               19.0              13.4              24.3
# of Observations             5778              5778              5778              5778              5778              5778
# of Municipality-Years       --                --                260               260               260               260
Grade 9
Intercept                     0.048             0.048             0.050             0.050             0.050             0.050
                              (0.046,0.050)     (0.046,0.050)     (0.048,0.052)     (0.047,0.052)     (0.048,0.052)     (0.048,0.052)
Random Baseline               1.085             --                --                1.006             --                0.936
                              (0.933,1.237)     --                --                (0.851,1.161)     --                (0.776,1.097)
Segregation in Peer Shift     --                0.146             --                --                -0.204            -0.200
                              --                (0.124,0.167)     --                --                (-0.279,-0.129)   (-0.263,-0.136)
Muni-Year Random Intercepts                                       x                 x                 x                 x
Variance Explained (%)        26.7              2.0               8.7               26.1              21.1              34.6
# of Observations             1082              1082              1082              1082              1082              1082
# of Municipality-Years       --                --                160               160               160               160


Note: Each column presents the results of a 3-level HLM model with municipality-years at level 2 and years at level 3 such that 
each coefficient is the tendency in the average municipality in the average year in 2011, 2013, 2015, and 2017. Each sample is 
restricted to observations for which segregation in peer shift is observed and municipalities with at least 10 such observations. 
Variance explained is the percentage reduction in level-1 variance as compared to an empty 2-level model of observations within 
years. Coefficient variation is in standard deviation units with p-values in parentheses.

Figures

Figure 1. Distribution of Classroom-Level Racial Segregation by Simulated Classroom
Assignment Processes, Over All Years and Grades.
Note: Kernel density plot using the Epanechnikov kernel. Random assignment and noisy
achievement sorting lines are each for the distribution of one draw per school-year-grade.

Figure 2. White/Black Within- and Between-School Segregation in Brazil and North Carolina
in 2017.
Note: North Carolina estimates from Clotfelter et al. (2020). Segregation estimates use the
Dissimilarity Index.

Figure 3. Relationships between Observed and Simulated Racial Segregation, by Grade and
Simulated Baseline, Over All Years.
Note: Lines are LOWESS lines. Lines for segregation under random assignment and segregation
under noisy achievement sorting are each the set of a single draw per school-year in the grade.
LOWESS lines vary little across draws such that plots including lines for all 50 draws per
school-year-grade are similar to those using one draw.

Figure 4. Within-Year Variance Explained by Predictor, in the Observed Data and When
Simulating Random Assignment, by Grade.
[Figure: two panels, Grade 5 and Grade 9. Each panel plots variance explained (%), on an axis
from 0 to 30, for each predictor: random baseline; strict and noisy achievement sorting
baselines; achievement sorting policy; Portuguese and math segregation and stratification; age
sorting baseline and policy; age and SES segregation and stratification; teacher experience,
salary, and tenure disparities (W-NW); municipality, state, and region random effects;
segregation in peer shift; and segregation in adjacent years. The legend distinguishes the
observed value from the 90-10% range in simulations.]
Note: Variance explained is the percentage of within-year variance explained by the predictor.

Figure 5. Predicted Contribution by Predictor, in the Observed Data and When Simulating
Random Assignment, by Grade.
Note: Predicted contribution is the amount of segregation that would be attributed to the
predictor (as a percentage of the total classroom-level racial segregation in the model sample) if
the model results described a causal relationship, giving a sense of the size of the estimated
association. This is not the actual contribution to segregation as the model does not identify the
causal effect of the predictor.
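
The arithmetic behind a predicted contribution can be sketched as follows. The formula is an assumption rather than something the note spells out: it treats the contribution as the regression coefficient times the predictor's mean, divided by mean observed segregation. The function name is hypothetical; the inputs are the grade 5 random-baseline coefficient from Table A2 and the grade 5 means from Table 1.

```python
def predicted_contribution(coefficient, predictor_mean, outcome_mean):
    """Percent of mean segregation attributed to a predictor if its coefficient were causal.

    ASSUMPTION: contribution = coefficient * mean(predictor) / mean(outcome).
    The paper does not state the formula; this is one plausible reading of the note.
    """
    return 100.0 * coefficient * predictor_mean / outcome_mean

# Grade 5 random baseline: bivariate coefficient 1.184 (Table A2),
# mean random baseline 0.051 and mean racial segregation 0.073 (Table 1).
share = predicted_contribution(1.184, 0.051, 0.073)
print(round(share, 1))  # 82.7
```

Under this reading the result lands near the 82.3% reported for the grade 5 random baseline; the small gap would be expected from using rounded table values.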

Appendix A
Constructing Segregation Correlates 
We measure segregation by achievement, age, and SES as we do segregation by race, 
operationalizing SES as the student-reported educational attainment of their mothers and fathers 
using whichever one is greater. 

We also use H to measure the racial stratification by each of these characteristics within 
schools. Racial stratification by a characteristic is the degree to which that characteristic is 
unevenly distributed across racial groups, indicating the extent to which the distributions within 
the different racial groups do not overlap. We capture this by measuring racial stratification as the 
“segregation” of the characteristic across racial groups, as opposed to classrooms. One concern 
with the stratification measures is that using each racial group could dampen the signal when one 
group is stratified from the rest. In supplemental analyses, we included stratification measures that 
used binary race schemes comparing one racial group to all others (e.g., whites vs nonwhites), for 
each racial group. These analyses, which are available upon request, did not substantively alter our 
findings. 

In addition to the general sample restrictions, we further restrict the samples for analyses 
using these measures to only include schools in which there are multiple classes with at least 25 
percent response rates to the relevant item. Additionally, stratification predictors are only included 
if the school has students from multiple racial groups. 

We measure racial disparities in teacher status by considering teachers’ experience, tenure 
status, and salary, as reported by teachers in Censo Escolar. Tenure status is a binary indicator of 
whether a teacher has tenure at the school. Teacher salary and experience are originally reported
in bins. We interpolate a continuous measure by using interval regression to fit a normal
distribution y’ to the original measure y, giving observations within a bin the mean value of y’
when it falls within the same bin. This is the expected value for a randomly chosen teacher given 
that y is normally distributed. 
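
The within-bin value described above is the mean of a normal distribution truncated to the bin. The sketch below computes it in Python with the standard library only. It is illustrative rather than the code used here: the function names are hypothetical, and the mean and standard deviation are taken as given, whereas in the procedure described they would come from the interval regression.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def bin_mean(mu, sigma, lower, upper):
    """Expected value of y' ~ Normal(mu, sigma) given lower < y' <= upper.

    Bounds may be -math.inf or math.inf for open-ended bins.
    """
    a = (lower - mu) / sigma
    b = (upper - mu) / sigma
    mass = normal_cdf(b) - normal_cdf(a)  # probability of falling in the bin
    return mu + sigma * (normal_pdf(a) - normal_pdf(b)) / mass

# Hypothetical example: experience with fitted mean 12 years and SD 8 years,
# for a teacher who reported the "10 to 15 years" bin.
imputed = bin_mean(mu=12.0, sigma=8.0, lower=10.0, upper=15.0)
```

Every teacher in the same bin receives the same imputed value, so the interpolation smooths the bin midpoints toward the fitted distribution without adding within-bin variation.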

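Given a characteristic, $C_{tj}$, of teacher t of classroom j, we measure teacher disparities by
averaging each classroom’s teachers’ characteristics then taking the difference in means between
whites (W) and nonwhites (NW) in these classroom values:

$$\frac{1}{N_W} \sum_{i \in W} \bar{C}_{j(i)} - \frac{1}{N_{NW}} \sum_{i \in NW} \bar{C}_{j(i)} \tag{A1}$$

where $\bar{C}_{j(i)}$ is the average characteristic of the teachers of student i’s classroom and $N_W$ and
$N_{NW}$ are the numbers of white and nonwhite students in the school.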


We further restrict the samples for analyses using teacher disparities to schools in which 
there are survey responses from math and Portuguese teachers (which may be the same teacher), 
the relevant characteristic is reported for each teacher surveyed, mean values vary across 
classrooms, and there are at least five white and five nonwhite students in the school. 

One concern with focusing on white-nonwhite disparities is that other disparities could be 
more important, particularly in schools with few white students. In supplemental analyses, we 
included teacher disparities measures focused on pardos, pretos, and students who responded “I
don’t know”. These analyses, which are available upon request, did not substantively alter our 
findings. 

The two predictors capturing school assignment policy are drawn from the same item in 
the Censo Escolar principal surveys, which asks principals how they determine classroom 
assignments. Possible replies include achievement homogeneity, achievement heterogeneity, age 
homogeneity, age heterogeneity, other, and none. The measures of achievement sorting policy and
age sorting policy are indicators of whether the principals reported achievement homogeneity and
age homogeneity, respectively.

Appendix B


Brazil-North Carolina Comparison Table 


Table A1. White/Black within- and between-school segregation in Brazil and North Carolina
in 2017.

                              Brazil            North Carolina    Brazil            North Carolina
                              Grade 5           Grade 4           Grade 9           Grade 10
                              D       %         D       %         D       %         D       %
Between-School Segregation    0.23    44.6%     0.43    87.8%     0.18    41.8%     0.33    62.3%
Within-School Segregation     0.29    55.4%     0.06    12.2%     0.25    58.2%     0.20    37.7%
Total                         0.52              0.49              0.44              0.53

Note: North Carolina estimates from Clotfelter et al. (2020). Segregation estimates use the
Dissimilarity Index.
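
For readers unfamiliar with the Dissimilarity Index, the two-group version can be computed as in the minimal Python sketch below. The function name and the toy counts are hypothetical, and the sketch covers only the index itself, not the way the table splits it into between-school and within-school parts.

```python
def dissimilarity(units):
    """Two-group Dissimilarity Index D.

    units: list of (white_count, black_count) pairs, one per classroom or school.
    D is half the summed absolute difference between each unit's share of all
    white students and its share of all black students.
    """
    total_white = sum(w for w, _ in units)
    total_black = sum(b for _, b in units)
    if total_white == 0 or total_black == 0:
        raise ValueError("both groups must be present to compute D")
    return 0.5 * sum(abs(w / total_white - b / total_black) for w, b in units)

# Toy example: two classrooms with mirrored compositions.
d = dissimilarity([(8, 2), (2, 8)])
```

D ranges from 0 (every unit mirrors the overall composition) to 1 (no unit contains both groups), and can be read as the share of either group that would have to change units to equalize the distribution.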

Appendix C
Scale Decomposition
Unlike most segregation measures, the index H is additively decomposable, allowing for the
unambiguous attribution of segregation to its within-unit and between-unit components (Reardon
et al., 2000; Reardon & Firebaugh, 2002). Given K schools in municipality L, the segregation
across all classrooms J in L, $H_{J \subset L}$, is the sum of a between-school within-municipality component,
$H_{K \subset L}$, and a within-school between-classrooms component that is the weighted average of the K
within-school segregation values $H_{J \subset k}$:

$$H_{J \subset L} = H_{K \subset L} + \sum_{k=1}^{K} \frac{N_{kL} E_{kL}}{N_L E_L} H_{J \subset k} \tag{A2}$$

where $E_{kL}$ and $E_L$ are the entropy of school k in municipality L and the entropy of the municipality
L, respectively, and similarly $N_{kL}$ and $N_L$ are respectively the total student populations of school
k in municipality L and of municipality L. Likewise, segregation between classrooms within a
state can be decomposed into its between-municipality and within-municipality components, and
so on.
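
The additive property can be checked numerically. The following minimal Python sketch computes H across all classrooms in a toy municipality, H across its schools, and the entropy- and population-weighted sum of within-school H values, then confirms that the first equals the sum of the other two. The function names and the toy data are hypothetical, and the entropy-based H is assumed to be Theil's information theory index.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a list of group labels."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values()) if n else 0.0

def theil_h(units):
    """Theil's H across units (lists of group labels) relative to their pooled population."""
    pooled = [g for unit in units for g in unit]
    e, n = entropy(pooled), len(pooled)
    if e == 0:
        return 0.0
    return sum(len(u) * (e - entropy(u)) for u in units) / (n * e)

def decompose(schools):
    """schools: list of schools, each a list of classrooms (lists of labels).

    Returns (total, between_school, within_school) for the municipality.
    """
    rooms = [room for school in schools for room in school]
    pooled = [[g for room in school for g in room] for school in schools]
    n_l = sum(len(s) for s in pooled)
    e_l = entropy([g for s in pooled for g in s])
    total = theil_h(rooms)
    between = theil_h(pooled)
    within = 0.0
    if e_l > 0:
        # Weight each school's within-school H by (N_k * E_k) / (N_L * E_L).
        within = sum(
            len(s) * entropy(s) / (n_l * e_l) * theil_h(school)
            for s, school in zip(pooled, schools)
        )
    return total, between, within

municipality = [
    [["W", "W", "B", "P"], ["B", "P", "P", "P"]],  # school 1, two classrooms
    [["W", "W", "W", "B"], ["W", "B", "B"]],       # school 2, two classrooms
]
total, between, within = decompose(municipality)
```

Because schools with no racial diversity have zero entropy, they receive zero weight in the within-school term, which is why the weights reflect diversity as well as enrollment.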

We first decomposed the nationwide racial segregation between classrooms into several 
nested institutional units: regions, states, municipalities, municipalities × administrations (i.e.,
state schools vs municipal schools within a municipality), school administrations, schools and 
classrooms. For simplicity, our analysis collapses units to focus on the institutional boundaries that 
were found to be most consequential. In each year and grade, the plurality of racial segregation in 
Brazil’s multi-classroom public schools occurs between classrooms in the same school, not at the
level of traditional suspects like regional differences, municipality differences within regions, or
school differences within municipalities. Classroom-level segregation accounts for roughly 40 percent of
the segregation in grade 5 and roughly 30-35 percent in grade 9. 

However, our data set is limited to public schools. It is unclear how segregated private 
sector classrooms are or how much segregation occurs between sectors. Brazil is known for its 
relatively large and disproportionately white private sector, so it is possible Figure A1 overstates
the role of classroom-level segregation. One solution is to provide a lower bound on the proportion 
of segregation that occurs within schools. Suppose the private sector was all-white and every 
school-grade had multiple classrooms. Given 13-16% private school enrollment in both grades 
according to Sinopse Estatística da Educação Básica (Instituto Nacional de Estudos e Pesquisas
Educacionais Anísio Teixeira, 2011, 2013, 2015, 2017), we simulate the proportion of segregation
at the classroom level within each grade and year under this extreme hypothetical. This provides 
a lower bound estimate of the contribution of classroom-level segregation for all multi-classroom 
schools. The role of classroom segregation is diminished substantially, but it remains large; in 
2011, 2013, 2015, and 2017, the percentage of segregation at the classroom level in 5th grade
would reduce to 28%, 27%, 25%, and 26%, respectively. In 9th grade, the lower bounds are 22%,
21%, 19%, and 19%, respectively. Even under the most extreme assumptions, classroom-level
segregation is an important component of the segregation among all multi-classroom schools in
both 5th and 9th grade.

Figure A1. Racial Segregation Decomposed by Segregation Scale, by Year and Grade.
Note: Total segregation between classrooms across the nation is reported at top.

Appendix D
Which Racial Groups Are Segregated? 

Using a multigroup segregation measure captures the racial segregation experienced by more 
students at the expense of flattening the segregation of particular groups and of particular dyads of 
groups into a single measure. To better understand how each racial group and racial group dyad 
contributes to multigroup segregation, we follow Reardon et al.’s (2000) between-group 
decomposition of H. Given six racial groups A, B, C, D, E, and F, the proportion of multigroup
classroom segregation of the six groups, $H^M = H^{A \backslash B \backslash C \backslash D \backslash E \backslash F}$, that is due to the segregation of
group A from group B is

$$P^{A \backslash B} = \frac{\pi_{AB} E^{A \backslash B} H^{A \backslash B}}{E^M H^M} \tag{A3}$$

where $\pi_{AB}$ is the proportion of the school population that is in either group A or group B. Similarly,
one can compute the proportion of segregation that is due to segregation between group A and all
non-A students, in which case $\pi = 1$.

Drawing from Eq. A2, the amount of all classroom segregation in the nation that is due to
the classroom-level segregation of groups A and B is

$$H^{A \backslash B} = \sum_{k=1}^{K} \frac{N_k E_k}{N E} P_k^{A \backslash B} H^M_{J \subset k} \tag{A4}$$

where $H^M_{J \subset k}$ is the multigroup segregation among classrooms j in school k, $P_k^{A \backslash B}$ is the proportion
of multigroup segregation due to segregation among groups A and B in school k, $\frac{N_k E_k}{N E}$ weights
segregation by diversity and population, and the sum is taken over all K schools in the nation.
We compute these values in each grade and year for each dyad as well as for each racial
group using all other students as the comparison, then average over years within each grade. Note
that these values do not sum to the total segregation value (e.g., .166 in grade 5 in 2011) because
the segregations of different groups from one another are not discrete phenomena.
Additionally, some of the pattern would occur under random assignment. To isolate the pattern
that would not occur under random assignment, we repeat this analysis in simulations using
random assignment (N=50), subtracting the average result in the simulations from the observed
results. Figure A2 presents the group decomposition without accounting for random assignment
while Figure A3 presents it after removing the values under random assignment.

Figure A2. Dyad-Specific Classroom Segregation Contribution to Total Segregation between
Classrooms across the Nation, by Grade.
Note: White-highlighted boxes on the diagonal refer to segregation between the given group and
all others (i.e., the white × white box reports the contribution from white-nonwhite segregation).
“Unknown” is used as shorthand for students who responded “I don’t know”.

Figure A3. Dyad-Specific Classroom Segregation Contribution to Total Segregation between
Classrooms across the Nation Net of the Average Value under Random Assignment, by Grade.
Note: White-highlighted boxes on the diagonal refer to segregation between the given group and
all others (i.e., the white × white box reports the contribution from white-nonwhite segregation).
“Unknown” is used as shorthand for students who responded “I don’t know”.

Appendix E
Bivariate Association Tables 


Table A2. Fifth Grade Bivariate Relationships between Each Predictor and Racial Segregation in Observed Data and in Simulations 


of Random Classroom Assignment. 


                              Bivariate Association               Variance Explained (%)        Pred. Contribution to Seg. (%)
Predictor                     Observed          Simulations       Observed    Simulations       Observed            Simulations
Random Baseline               1.184             0.996             15.874      28.721            82.298              100.502
(N = 53452)                   (1.174,1.195)     (0.985,1.006)                 (28.223,29.258)   (81.572,83.024)     (99.422,101.463)
Strict Ach. Sorting Baseline  0.307             0.220             4.734       6.193             24.407              25.454
(N = 53452)                   (0.278,0.335)     (0.216,0.226)                 (5.927,6.475)     (22.145,26.668)     (24.909,26.083)
Noisy Ach. Sorting Baseline   0.677             0.532             9.463       14.887            51.311              58.582
(N = 53452)                   (0.633,0.720)     (0.525,0.539)                 (14.529,15.269)   (47.975,54.646)     (57.792,59.344)
Achievement Sorting Policy    0.003             0.001             0.017       0.011             0.209               0.108
(N = 52866)                   (0.002,0.004)     (0.000,0.002)                 (0.000,0.022)     (0.117,0.302)       (0.050,0.159)
Portuguese Segregation        0.079             0.102             1.085       0.668             4.352               3.664
(N = 53435)                   (0.067,0.092)     (0.092,0.111)                 (0.550,0.773)     (3.668,5.036)       (3.312,3.973)
Portuguese Stratification     0.052             0.034             0.370       0.371             5.277               4.959
(N = 53424)                   (0.036,0.068)     (0.030,0.037)                 (0.304,0.446)     (3.647,6.907)       (4.502,5.494)
Math Segregation              0.078             0.102             1.119       0.678             4.381               3.675
(N = 53435)                   (0.064,0.092)     (0.095,0.112)                 (0.562,0.812)     (3.603,5.158)       (3.413,3.999)
Math Stratification           0.057             0.037             0.431       0.444             5.729               5.433
(N = 53424)                   (0.043,0.072)     (0.034,0.040)                 (0.373,0.515)     (4.281,7.178)       (4.987,5.837)
Age Sorting Baseline          0.322             0.256             4.560       7.168             23.163              26.745
(N = 53452)                   (0.284,0.360)     (0.251,0.260)                 (6.877,7.407)     (20.449,25.877)     (26.231,27.167)
Age Sorting Policy            -0.001            -0.001            0.026       0.022             -0.646              -0.574
(N = 52866)                   (-0.003,-0.000)   (-0.001,-0.001)               (0.011,0.034)     (-1.222,-0.069)     (-0.781,-0.368)
Age Segregation               0.045             0.035             0.836       0.641             5.200               4.104
(N = 49773)                   (0.038,0.052)     (0.032,0.037)                 (0.567,0.726)     (4.402,5.998)       (3.836,4.380)
Age Stratification            0.006             0.007             0.026       0.090             1.196               1.946
(N = 49764)                   (0.004,0.009)     (0.006,0.008)                 (0.067,0.118)     (0.732,1.660)       (1.638,2.242)
SES Segregation               0.132             0.082             1.786       0.844             6.145               3.930
(N = 6684)                    (0.122,0.142)     (0.067,0.098)                 (0.556,1.150)     (5.678,6.612)       (3.211,4.691)
SES Stratification            0.022             0.021             0.132       0.216             1.901               2.630
(N = 6679)                    (-0.000,0.043)    (0.015,0.027)                 (0.109,0.367)     (-0.006,3.808)      (1.952,3.384)
T Experience Disp. (W-NW)     -0.000            0                 0.043       0                 -0.005              0
(N = 16415)                   (-0.001,0.001)                                                    (-0.053,0.044)
T Salary Disp. (W-NW)         0.002             0                 0.074       0                 0.007               0
(N = 13620)                   (-0.003,0.008)                                                    (-0.010,0.024)
T Tenure Disp. (W-NW)         -0.002            0                 0.034       0                 -0.007              0
(N = 11444)                   (-0.009,0.005)                                                    (-0.028,0.015)
Municipality Intercepts       --                --                11.941      6.872             --                  --
(N = 53452)                                                                   (6.359,7.360)
State Intercepts              --                --                2.308       1.619             --                  --
(N = 53452)                                                                   (1.528,1.765)
Region Intercepts             --                --                0.832       0.200             --                  --
(N = 53452)                                                                   (0.155,0.241)
Segregation in Peer Shift     0.184             0.097             3.366       1.049             18.119              9.568
(N = 12228)                   (0.163,0.206)     (0.078,0.115)                 (0.699,1.439)     (16.004,20.234)     (7.672,11.341)
Seg. in Adjacent Years        0.014             0.021             0.009       0.050             1.360               2.054
(N = 18256)                   (0.009,0.018)     (0.005,0.032)                 (0.000,0.104)     (0.923,1.798)       (0.522,3.207)


Note: Each cell presents estimates from either a single model or several models. All estimates are from HLM models reporting year- 
average bivariate associations. Output for observed data shows estimates with 95% confidence intervals using robust standard errors.
Output for simulated random classroom assignment (n=50) shows mean estimates with the 90-10% range of estimates. In the case of
teacher disparities, we know a priori that there is no association given random assignment. Variance explained is the percentage of 
within-year variance explained by the predictor. Predicted contribution to segregation is the amount of segregation that would be 
attributed to the predictor (as a percentage of the total classroom-level racial segregation in the model sample) if the model results 
described a causal relationship, giving a sense of the size of the estimated association. This is not the actual contribution to segregation 
as the model does not identify the causal effect of the predictor. 
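
As a sketch of how the simulation columns could be summarized, the snippet below computes the mean and a 10th-to-90th percentile range for a list of simulated estimates. Reading the "90-10% range" as these two percentiles is an assumption, as are the function name and the placeholder inputs. It is not the code used to produce the tables.

```python
from statistics import mean, quantiles


def summarize_simulations(estimates):
    """Return (mean, 10th percentile, 90th percentile) of simulated estimates."""
    # quantiles(..., n=10) returns the nine decile cut points.
    deciles = quantiles(estimates, n=10, method="inclusive")
    return mean(estimates), deciles[0], deciles[-1]


# Placeholder values standing in for estimates from repeated simulations.
placeholder_estimates = list(range(101))
sim_mean, sim_p10, sim_p90 = summarize_simulations(placeholder_estimates)
print(sim_mean, sim_p10, sim_p90)
```

An observed estimate that falls outside this simulated range is one that arbitrary assignment rarely reproduces, which is the comparison the Observed and Simulations columns are meant to support.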
Table A3. Ninth Grade Bivariate Relationships between Each Predictor and Racial Segregation in Observed Data and in Simulations
of Random Classroom Assignment.

                              Bivariate Association               Variance Explained (%)        Pred. Contribution to Seg. (%)
Predictor                     Observed          Simulations       Observed    Simulations       Observed            Simulations
Random Baseline               1.046             0.999             23.092      30.990            90.528              100.197
(N = 32068)                   (1.037,1.055)     (0.985,1.015)                 (30.393,31.677)   (89.750,91.306)     (98.766,101.806)
Strict Ach. Sorting Baseline  0.288             0.244             7.243       7.509             28.038              27.554
(N = 32068)                   (0.270,0.306)     (0.238,0.250)                 (7.164,7.811)     (26.287,29.789)     (26.818,28.232)
Noisy Ach. Sorting Baseline   0.645             0.575             15.216      17.525            60.336              62.341
(N = 32068)                   (0.632,0.659)     (0.564,0.585)                 (17.058,17.991)   (59.073,61.599)     (61.102,63.424)
Achievement Sorting Policy    0.001             0.000             0.008       0.008             0.053               0.028
(N = 31725)                   (-0.002,0.003)    (-0.001,0.002)                (0.000,0.012)     (-0.112,0.217)      (-0.041,0.110)
Portuguese Segregation        0.065             0.120             0.880       0.804             4.061               3.998
(N = 32044)                   (0.058,0.072)     (0.106,0.133)                 (0.632,1.018)     (3.622,4.500)       (3.550,4.454)
Portuguese Stratification     0.072             0.048             1.056       0.671             8.246               6.317
(N = 32042)                   (0.063,0.081)     (0.043,0.052)                 (0.542,0.803)     (7.232,9.261)       (5.662,6.903)
Math Segregation              0.073             0.120             1.022       0.806             4.275               3.996
(N = 32044)                   (0.062,0.083)     (0.104,0.136)                 (0.600,1.033)     (3.664,4.887)       (3.478,4.548)
Math Stratification           0.074             0.050             1.073       0.700             8.218               6.405
(N = 32042)                   (0.065,0.083)     (0.046,0.054)                 (0.607,0.817)     (7.205,9.231)       (5.984,6.961)
Age Sorting Baseline          0.329             0.282             8.300       8.635             29.561              29.382
(N = 32068)                   (0.314,0.344)     (0.275,0.289)                 (8.278,8.966)     (28.200,30.922)     (28.608,30.150)
Age Sorting Policy            -0.002            -0.002            0.118       0.136             -1.404              -1.550
(N = 31725)                   (-0.003,-0.001)   (-0.002,-0.002)               (0.097,0.175)     (-2.208,-0.601)     (-1.807,-1.280)
Age Segregation               0.017             0.039             0.319       0.582             2.714               3.480
(N = 31190)                   (0.016,0.018)     (0.035,0.043)                 (0.464,0.696)     (2.592,2.835)       (3.146,3.809)
Age Stratification            0.022             0.012             0.395       0.172             4.141               2.578
(N = 31188)                   (0.020,0.024)     (0.010,0.014)                 (0.121,0.230)     (3.848,4.435)       (2.215,2.977)
SES Segregation               0.075             0.056             1.004       0.525             4.326               2.943
(N = 25210)                   (0.071,0.080)     (0.049,0.064)                 (0.389,0.672)     (4.073,4.580)       (2.594,3.344)
SES Stratification            0.015             0.010             0.079       0.064             1.999               1.594
(N = 25209)                   (0.011,0.019)     (0.007,0.014)                 (0.031,0.111)     (1.426,2.573)       (1.078,2.215)
T Experience Disp. (W-NW)     0.000             0                 -0.010      0                 0.004               0
(N = 5743)                    (-0.001,0.001)                                                    (-0.030,0.038)
T Salary Disp. (W-NW)         0.000             0                 -0.009      0                 0.001               0
(N = 4136)                    (-0.005,0.005)                                                    (-0.029,0.031)
T Tenure Disp. (W-NW)         -0.006            0                 0.018       0                 -0.023              0
(N = 6482)                    (-0.012,-0.000)                                                   (-0.045,-0.002)
Municipality Intercepts       --                --                10.857      7.116             --                  --
(N = 32068)                                                                   (6.404,7.942)
State Intercepts              --                --                2.381       2.520             --                  --
(N = 32068)                                                                   (2.342,2.698)
Region Intercepts             --                --                1.115       0.857             --                  --
(N = 32068)                                                                   (0.722,0.971)
Segregation in Peer Shift     0.157             0.095             2.621       1.178             15.545              9.374
(N = 4030)                    (0.126,0.188)     (0.062,0.125)                 (0.520,1.801)     (12.459,18.630)     (6.146,12.426)
Seg. in Adjacent Years        0.013             0.020             0.009       0.063             1.271               2.029
(N = 8858)                    (0.007,0.018)     (0.003,0.044)                 (0.000,0.181)     (0.707,1.836)       (0.268,4.328)


Note: Each cell presents estimates from either a single model or several models. All estimates are from HLM models reporting year- 
average bivariate associations. Output for observed data shows estimates with 95% confidence intervals using robust standard errors.
Output for simulated random classroom assignment (n=50) shows mean estimates with the 90-10% range of estimates. In the case of
teacher disparities, we know a priori that there is no association given random assignment. Variance explained is the percentage of 
within-year variance explained by the predictor. Predicted contribution to segregation is the amount of segregation that would be 
attributed to the predictor (as a percentage of the total classroom-level racial segregation in the model sample) if the model results 
described a causal relationship, giving a sense of the size of the estimated association. This is not the actual contribution to segregation 
as the model does not identify the causal effect of the predictor. 


